D-alanyl-D-alanine carboxypeptidase from Bacillus stearothermophilus (Cpase), a
membrane-bound penicillin-binding enzyme, was supplied to us in two forms. The first, a
native enzyme, was purified from the bacterial membranes. The second was a genetically
engineered soluble form of the enzyme produced by cloning the gene for Cpase and
inserting it into the fungus Pichia, which then secreted Cpase into its medium.

Engineered Cpase was purified from the fungal medium using ion exchange and
hydrophobic chromatography. Crystals were grown from both the native and the
engineered enzyme. Some crystals of the native enzyme were used in X-ray diffraction
experiments. Several Laue photographs and native and platinum data sets complete to
3.5 Å were collected. Data extended to 3.2 Å resolution (76% completeness). The
platinum atoms were located by direct methods and a preliminary molecular boundary
was obtained.

During the course of this work various observations were made which led us to
suggest that Cpase may consist of two domains. The first 280 N-terminal residues may
form a catalytic domain, having a structure very similar to the water soluble
penicillin-binding protein R-61 and the water soluble β-lactamases such as that
produced by Bacillus licheniformis. The second domain of approximately 120 residues
might function in positioning the catalytic domain and would terminate in the
approximately 26 residues which are known to anchor Cpase to the cell membrane.

INTRODUCTION


Penicillin Interactive Enzymes 


The family of bacterial enzymes which interact with β-lactam antibiotics is
interesting for both scientific and practical reasons. For the past fifty years, β-lactams
have been of major importance in the control of bacterial infections. This, coupled with
the fact that in recent years many bacteria which once readily succumbed to treatment
with β-lactam antibiotics now show resistance to these drugs, gives practical importance
to understanding all aspects of these enzymes. These enzymes range in size from 25,000
daltons to 100,000 daltons, and include membrane-bound enzymes as well as others
which are not. They also vary in function from cell wall synthesis to the defense of the
organism against β-lactam antibiotics. Yet despite this diversity, penicillin interactive
enzymes appear to have similar active sites and mechanisms of action.

The β-lactam interactive enzymes can be divided by their function into two
groups. The first group is the cell wall synthetic enzymes, called penicillin binding
proteins (PBPs), which are inhibited by β-lactam antibiotics. Each species of eubacteria
produces several different types of high molecular mass PBP (60 kDa or greater) as well
as several different types of low molecular mass PBP (less than 60 kDa). Each of these
proteins has a slightly different function within the cell. The high molecular mass PBPs
are all membrane-bound, bifunctional (glycosylases/peptidases) enzymes which are
essential for bacterial life since they are required for the final assembly of peptidoglycan,
which forms part of the bacterial cell wall. They are therefore referred to as killing
enzymes, since their inhibition results in cell death. Most of the low molecular mass
PBPs are also membrane bound, although a few are water soluble and excreted into the
bacterial periplasm. These monofunctional enzymes, which function as peptidases, are not
usually killing enzymes, as many bacteria can survive the inactivation of one or more of
these proteins.

The second group, the β-lactamases, break the β-lactam ring of penicillin and related
compounds, thus inactivating these antibiotics. The β-lactamases are monofunctional
enzymes similar in size to the low molecular mass PBPs. They are water soluble enzymes
which are secreted into the bacterial periplasm. In many respects, low molecular mass
PBPs and β-lactamases appear to be more similar than the low and high molecular mass
PBPs.


The Action of Penicillin Interactive Enzymes 


Peptidoglycan forms an integral part of all eubacterial cell walls. While it is of
lesser importance in gram negative organisms, it can constitute 50% or more of the gram
positive cell wall. Bacteria are dependent on their cell walls to protect the cell membrane
from rupture due to the difference in osmotic pressure between the interior of the cell and
the medium in which it grows. Unless they are held in an isotonic medium, bacteria
without an intact cell wall burst and die.

The peptidoglycan layer is built from disaccharides of N-acetylglucosamine and
N-acetylmuramic acid. These disaccharides are transported to the outside of the cell
membrane and then linked into carbohydrate chains by β(1→4) bonds. Attached at the
number 3 carbon of the muramic acid residue is a chain of five amino acids, the last two
of which are D-alanyl-D-alanine. The final step of cell wall synthesis, a transpeptidase
reaction, is the cross-linking of one carbohydrate chain to its neighbor via peptide bonds
formed between the penultimate alanine of one muramic acid moiety and an amino group
usually found on the third amino acid residue on the neighboring carbohydrate chain.
Figure 1 is an illustration of a peptidoglycan structure.


[Figure 1: drawing not reproduced.]

Figure 1. Peptidoglycan. A. Diagram of cross-linking between two polysaccharide chains. Xaa stands for
a diamino acid, such as diaminopimelic acid, which varies from one species to the next. G stands for
N-acetylglucosamine and M stands for N-acetylmuramic acid. B. The disaccharide formed from
N-acetylglucosamine and N-acetylmuramic acid.


In most bacteria both of these steps (the joining of the disaccharides to form the 
polysaccharide, and the transpeptidase reaction) are catalyzed by high molecular mass 
PBPs. The amino-terminal domain of a high molecular mass PBP controls the elongation 
of the polysaccharide chain and is penicillin insensitive. The carboxy-terminal domain, 
which is penicillin sensitive, catalyzes peptide cross linking. These enzymes have been 
difficult to study due to their large size, water insolubility and the fact that they will not 
use simple compounds such as small peptides as substrates for the transpeptidase reaction. 
(See reviews by Frere & Joris, 1985; Ghuysen, 1991; Frere, 1992). 

Most low molecular weight PBPs are able to function in vitro as both
transpeptidases and carboxypeptidases. That is, they can carry out the final cross-linking
of the peptides in cell wall synthesis or a reaction in which the terminal alanine is
removed from the peptide chain so that the amount of cross-linking in the peptidoglycan
is limited. The mechanisms for these reactions and for that of the β-lactamases are given
in figure 2 (Waxman and Strominger, 1983). As can be seen, the mechanisms are closely
related.


[Figure 2: reaction schemes not reproduced. Panel A shows the transpeptidase and
carboxypeptidase reactions of the enzyme (E-OH); panel B shows the reaction of a PBP or a
β-lactamase with penicillin.]

Figure 2. A. The reaction of carboxypeptidase or transpeptidase with the terminal alanine dipeptide of
N-acetylmuramic acid. In the case of the transpeptidase reaction, the second substrate is an amino group from
another peptide chain. For carboxypeptidation the second substrate is water. B. Reaction of β-lactamase or
PBP with penicillin. The intermediate formed with PBP and a β-lactam antibiotic is very stable, while that
formed with a β-lactamase is short lived. In both A. and B. the reaction mechanism is very similar. The
oxygen of an activated serine in the enzyme attacks the carbonyl carbon, causing the β-lactam ring to open.
What happens in the second step depends on the availability of a second substrate and the type of enzyme.


For the PBPs, it is presumed that the ultimate and penultimate alanine residues of the
peptide side chain of muramic acid fit into the enzyme active site. This is consistent with
observations made by crystallographers on the formation of acyl-enzyme intermediates
between β-lactams and the low molecular weight PBP from Streptomyces R-61 (Kelly et
al., 1986 & 1989). The oxygen of an activated serine from the enzyme then carries out a
nucleophilic attack on the carbonyl carbon of the peptide bond between the terminal
alanine residues. This results in the release of free alanine and the formation of an
acyl-enzyme intermediate. A second substrate is then required to complete the reaction. If
this substrate is an appropriate amino group from another peptidoglycan chain,
cross-linking occurs and the enzyme functions as a transpeptidase. If the second substrate
is water, then the original peptide is released shortened by a single amino acid residue
and no cross-linking will be possible at that site. The enzyme in this case is functioning
as a carboxypeptidase.

Tipper and Strominger (1965) were the first to suggest that penicillin inhibits
these enzymes because its three dimensional structure closely resembles that of
D-alanyl-D-alanine. If penicillin enters the active site, the activated serine will attack the
carbonyl carbon of the β-lactam ring, causing the ring to open and a penicilloyl-enzyme
intermediate to form. In the case of a carboxypeptidase or a transpeptidase,
this intermediate is stable and the enzyme is inhibited. If however the enzyme is a
β-lactamase, the intermediate will be short lived and fragments of penicillin are soon
released. Thus the primary differences between the PBPs and the β-lactamases are found
in the stability of the acyl intermediate and possibly the binding site for the second
substrate.

There has been much interest in studying the low molecular weight PBPs because
it seems likely that their active sites and mechanisms of action will have much in common
with the transpeptidase site of the high molecular mass PBPs. Jamin and coworkers (Jamin
et al., 1993) succeeded in obtaining a fairly large quantity of PBP 2x (a high molecular
weight PBP which is a lethal target for penicillin) from Streptococcus pneumoniae by
preparing a genetic construct of the gene lacking the region which codes for the
membrane anchor and inserting this gene into E. coli. Biochemical studies indicated that
the mechanism of action for this enzyme was the same as that presented above, that is, the
same as that for the low molecular weight PBPs. While PBP 2x would not use the simple
substrate diacetyl-L-Lys-D-Ala-D-Ala, which is readily accepted by many low molecular
weight PBPs, it did use several other substrates in common with the low molecular weight
PBPs. These authors present evidence for the formation of the acyl-enzyme intermediate
both with substrates and with β-lactams, and also noted that D-histidine and
D-phenylalanine are among the best second substrates for both R-61, a low molecular
weight PBP (see below), and PBP 2x.

While most of these low molecular mass PBPs are membrane-bound and are
therefore water insoluble in their natural state, there are two, one from Streptomyces R-61
and another from Actinomadura R39 (Ghuysen, 1991), which are excreted into the
bacterial periplasm and are naturally water soluble. Single crystal X-ray diffraction
studies on the R-61 enzyme have yielded a three-dimensional structure (Kelly et al.,
1982, 1986 & 1994). While the structures of several of the β-lactamases have been
determined by X-ray diffraction (Herzberg & Moult, 1987; Dideberg et al., 1987; Moews
et al., 1990; Oefner et al., 1990), R-61 remains the only PBP with a known three
dimensional structure. Such studies have not been completed for any of the membrane
bound enzymes because of difficulties in obtaining the necessary single crystals of
suitable quality.

D-alanyl-D-alanine-carboxypeptidase from Bacillus stearothermophilus


D-alanyl-D-alanine-carboxypeptidase from Bacillus stearothermophilus (also
called PBP-5 and here called Cpase) is a membrane-bound carboxypeptidase. It first
attracted the attention of researchers because more than 90% of the PBP produced by B.
stearothermophilus is of this type (Yocum et al., 1974), making it a good candidate for
biochemical studies because the problems of obtaining an ample yield of pure protein
were somewhat reduced. In 1979 Waxman and Strominger published a method for
obtaining water soluble Cpase in which the enzyme was first extracted from the bacterial
membranes with the non-ionic detergent Triton. Cpase was then covalently bound to a
penicillin affinity resin, and while still bound to the resin it was treated with trypsin or
α-chymotrypsin. Both trypsin and α-chymotrypsin are serine proteases, but they have
different specificities. The substrate binding site of α-chymotrypsin contains a large
pocket to accommodate the bulky hydrophobic side chains of tyrosine, phenylalanine or
tryptophan, while trypsin has an aspartic acid in the bottom of its substrate binding pocket
and therefore binds substrates with positively charged residues (lysine or arginine). Thus
these enzymes cleave peptides at different points. The resin was next washed to remove
Triton and then the Cpase was eluted with hydroxylamine, which acts as a second substrate
and releases the enzyme from the matrix-bound penicillin. Using this method they were
able to obtain Cpase with an apparent molecular mass of about 45,000 daltons as
compared to 46,000 daltons for the native enzyme. These molecular masses were
determined by the mobility of the enzymes in a polyacrylamide gel compared to the
mobility of standards of known molecular mass. Under some conditions low
concentrations of fragments having apparent molecular masses of 30,000 daltons and
15,000 daltons were observed. These fragments retained the ability to bind penicillin.
These authors also report that Cpase and the fragments of Cpase are basic proteins with
isoelectric points somewhere around 10. Good resolution of these proteins was obtained
from electrophoresis on an agarose gel at pH 9.0. Studies on the catalytic activity of
water soluble Cpase show few differences from the native enzyme. The water soluble
enzyme was able to cleave the terminal alanine residue from
UDP-N-acetylmuramyl-L-Ala-D-Glu-L-Lys-D-Ala-D-Ala, a natural substrate, and from
diacetyl-L-Lys-D-Ala-D-Ala, a synthetic substrate, at approximately twice the rate of the
native enzyme when the native enzyme was solubilized with Triton. The water soluble
enzymes also bound penicillin and eventually released the same fragments of penicillin as
the native enzyme. Sequence analysis of the amino-terminal region of the water soluble
and the native enzymes revealed that they were the same. Therefore, these authors
concluded that digestion must take place at the carboxy-terminus of the protein.
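The gel-based mass estimates mentioned above rest on a calibration against standards of known mass. The following is a minimal sketch of that kind of calibration, assuming the usual approximately linear relation between log10(molecular mass) and relative mobility on SDS-PAGE. The standards listed are hypothetical illustrative values, not data from Waxman and Strominger or from this work, and the function names are our own.

```python
import math

def fit_log_mw_calibration(standards):
    """Least-squares fit of log10(MW) = slope * Rf + intercept.

    `standards` is a list of (relative_mobility, mass_in_daltons) pairs.
    """
    n = len(standards)
    xs = [rf for rf, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_mw(rf, slope, intercept):
    """Apparent molecular mass (daltons) of a band with relative mobility rf."""
    return 10 ** (slope * rf + intercept)

# HYPOTHETICAL standards (relative mobility, mass in daltons), for illustration only.
STANDARDS = [
    (0.15, 97_400),
    (0.30, 66_200),
    (0.48, 45_000),
    (0.66, 31_000),
    (0.82, 21_500),
    (0.93, 14_400),
]

if __name__ == "__main__":
    slope, intercept = fit_log_mw_calibration(STANDARDS)
    # Apparent mass of an unknown band measured at a (hypothetical) mobility of 0.47.
    print(round(estimate_mw(0.47, slope, intercept)))
```

Because the mass is read off a logarithmic scale, a small error in measured mobility translates into an error of a thousand daltons or more near 45,000 daltons, which is worth keeping in mind when comparing the 45,000 and 46,000 dalton figures.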

In a subsequent paper, Waxman and Strominger (1981) reported the amino acid
sequence of the cleaved carboxy-terminal fragment. Treatment of Cpase with
α-chymotrypsin results in the formation of a large water soluble fragment slightly smaller
than the original protein and four other small fragments (containing 2, 4, 9 and 11
residues respectively). Analysis of these fragments and of the 30 carboxyl-terminal
residues of intact Cpase led the authors to conclude that the water-soluble fragment was
shorter than the original peptide by 26 residues, and that the primary cleavage site was
between the 26th and 27th amino acid residues from the carboxyl terminus. While the four
small fragments reveal the presence of four cleavage sites, the authors reasoned that even
when the first cleavage does not occur between the 26th and the 27th residues, digestion
would continue to that point. They also concluded that these 26 carboxyl-terminal
residues anchor Cpase in the cell membrane. Examination of the amino acid sequence
suggested that this carboxyl-terminal tail interacts with the surface of the membrane
instead of penetrating it. Waxman and Strominger suggested an α-helix for the
secondary structure of this region.

R. Charnas from Hoffmann-LaRoche in Basel, Switzerland, prepared Cpase from 
Bacillus stearothermophilus using a method similar to that of Strominger and Waxman 
(personal communication). This material was made available to our laboratory at the 
University of Connecticut. 

Work on this project was later continued at Hoffmann-LaRoche in Nutley, New
Jersey, where R. Manning and C. Despreaux sequenced and then cloned the gene for
Cpase, omitting the hydrophobic tail, and inserted it into the fungus Pichia. The fungus
then excreted Cpase into its medium. Samples of this preparation were also provided to
us at the University of Connecticut.

The in vivo function of Cpase is poorly understood. As mentioned above, it is
clearly able to function as a carboxypeptidase in vitro. Many authors have suggested that
Cpase also acts as a carboxypeptidase in vivo, limiting the amount of cross-linking in the
peptidoglycan layer and thereby controlling the shape of the cell. Others have suggested
that under some conditions in vivo Cpase may function as a transpeptidase.

There is some evidence that a similar enzyme controls cell shape in E. coli, where a
10-fold over-production of PBP 5 caused a change in shape from the normal rods to round
cells (Stoker et al., 1983). Despreaux and Manning (1993) inserted the gene for Cpase
from Bacillus stearothermophilus into E. coli in such a fashion that the gene could be
induced. Induction, which caused the production of Cpase, was followed shortly by cell
lysis. Cpase from Bacillus stearothermophilus is apparently lethal for E. coli.

In this regard it is also of interest that during spore formation in the closely related
species Bacillus subtilis there is a shift in the type of PBP 5 formed. In forming spores,
the amount of PBP 5 decreases as PBP 5* appears. PBP 5* has a slightly lower
molecular weight than PBP 5 and slightly different activities (Buchanan and Neyman,
1986).

Thus the real function of Cpase, the reason why it is present in so many more
copies than the high molecular weight enzymes in Bacillus stearothermophilus, remains
something of a mystery.

X-ray Diffraction Studies of Single Enzyme Crystals


X-ray diffraction studies of single crystals of any molecule provide an electron
density map of that molecule. In the case of proteins such maps can be used to trace the
path of the peptide backbone. If the amino acid sequence of the protein is known, it is often
possible to fit this sequence into the electron density, allowing determination of the
secondary and tertiary structure of the non-hydrogen atoms of the protein. It is often
possible to obtain a clear picture of the active site, both in the native form and containing
bound inhibitor. These images, when coupled with the results of biochemical studies, may
allow a detailed understanding of the reaction mechanism for the enzyme. In the case of
the PBPs this information would be invaluable for designing new drugs (enzyme
inhibitors). Such images will also show the relationship between different domains
within a protein and the location of flexible regions, so that it may be possible to
determine how the protein fits into its environment.

In order to obtain X-ray diffraction images, large single crystals of an enzyme are 
required. Although the size of the crystal required varies with certain conditions 
discussed below, in general crystals at least 0.15 mm in all three directions will be 
necessary. In order to obtain these crystals, large quantities of very pure enzyme are 
required. In addition it is a great advantage to have the enzyme in water soluble form and 
free of detergent. (See Methods and Results.) 

Among the very interesting results of X-ray studies on penicillin interactive
enzymes is the discovery of the structural similarities between R-61, a transpeptidase, and
the β-lactamase from Bacillus licheniformis (BL). These enzymes have little homology in
their amino acid sequence except at certain specific locations, most notably at the active
serine where they both have serine-X-X-lysine (see the section on Choice of a Model
under Molecular Replacement). Yet their tertiary structures are strikingly similar (see
figure 3).


[Figure 3: ribbon drawings not reproduced; the two panels are labeled BL and R-61.]

Figure 3. Ribbon drawings of BL and R-61. α-helix 2 is labeled in both cases, as is the active serine, S70
on BL and S62 on R-61.


Both enzymes contain a five stranded antiparallel β-sheet with two α-helices on one face
and one α-helix on the other. The relative positions of these sheets and α-helices are so
similar that displays of them on a graphics terminal can be virtually superimposed. Both
structures contain additional α-helices, four of which are in essentially the same relative
positions in their respective proteins (Kelly et al., 1986). In order to gain a better
understanding of the active sites of these enzymes, R-61 was complexed with three
different β-lactam inhibitors prior to being studied with X-ray diffraction (Kelly et al.,
1989). Because β-lactams are inhibitors for PBPs such as R-61, these complexes have
half-lives lasting several days, allowing sufficient time for the collection of an X-ray
diffraction data set. In all three cases an acyl-enzyme intermediate was observed in which
the γ-oxygen of the active serine (residue number 62) was bound to the carbonyl carbon
of the now open β-lactam ring. These results were confirmed in studies using additional
β-lactams (Zhao, 1988 & Lui, 1990). The active serine is located at the terminus of
α-helix 2. Located just above the active serine is the lysine residue which is found three
residues downstream from the active serine in all PBPs and β-lactamases. It was now
clear why there is so much variation in the two residues which intervene between the
active serine and the conserved lysine: they are on the opposite side of the α-helix from
the active site. Three residues, histidine 298, threonine 299 and glycine 300 (in R-61),
located along the inner strand of the β-sheet, appear to have important functions. Histidine
and threonine appeared to form hydrogen bonds with the substrate, thus holding it in the
proper position, while glycine is located near the active serine and may be required
in that location to avoid steric interference with the active site. R-61 is the only penicillin
interactive protein with histidine at this position; all others have lysine on the inner
β-strand.

It is interesting that these maps of R-61 with bound β-lactam could be used to
model β-lactams into the active site of a high resolution map of the β-lactamase BL
(Moews et al., 1990). As in R-61, the active serine at the end of α-helix 2 would be in
position to bond with the substrate, with the lysine residue located directly above it in the
helix. On the adjacent β-sheet are located lysine, threonine and glycine. Apparently the
lysine in BL has a similar function to the histidine of R-61.

There must of course be some differences between a transpeptidase such as R-61
and a β-lactamase such as BL. Jamin et al. (1991) present good evidence that this
difference is due, in part at least, to the binding site for the second substrate. BL essentially
lacks a binding site for this substrate, while R-61 appears to have a rather large one, as
bulky amino acids such as histidine and phenylalanine are as readily used as second
substrates as alanine.

Outline of the Work in this Thesis


Genetically engineered Cpase from Bacillus stearothermophilus was extracted 
from the Pichia medium obtained from R. Manning and C. Despreaux of Hoffmann- 
LaRoche in Nutley, New Jersey and purified. Crystals were grown from this preparation 
but their quality was not sufficient to allow X-ray studies with the available equipment. 
Data obtained from these experiments were analyzed in order to make recommendations 
for future experiments. 

Crystals were grown from enzymatically solubilized Cpase prepared by R.
Charnas in Basel. Some of these crystals were of sufficient quality to allow X-ray
diffraction experiments. The data from these experiments were analyzed and an electron
density map which showed molecular boundaries was obtained.


METHODS AND RESULTS 
Purification of Genetically Engineered Cpase 


In order for crystals to form in an enzyme solution, the enzyme in that solution
must usually be of very high purity and at a concentration above 5 mg/ml. The presence
of different forms of an enzyme often disrupts the crystal lattice as it is being formed,
fitting into it poorly so that only small crystallites or precipitate is formed. In the early
days of protein chemistry, showing that crystals would form in an enzyme solution was an
accepted demonstration of the purity of proteins. More recently there are numerous
examples in which crystals could not be obtained from a protein solution. The proteins in
the solution were carefully examined and found to be heterogeneous. The preparations
were then treated to remove the heterogeneity, after which protein crystals were readily
obtained (Giege & Mikol, 1986).

A troublesome problem in considering Cpase for crystallization experiments is
that the native enzyme contains a hydrophobic carboxy-terminal region of about 26
residues which is believed to anchor the enzyme in the cell membrane and which renders
it water insoluble. The enzyme can be solubilized by enzymatic cleavage of the
carboxy-terminal residues (Waxman and Strominger, 1979). However, as there is more than
one cleavage site, the product of this procedure may be heterogeneous. Elimination of this
heterogeneity is desirable in the preparation of enzyme for crystallization trials. For this
reason the work of C. Despreaux and R. Manning at Hoffmann-La Roche in Nutley, N.J.,
was of interest to us. They isolated and sequenced the Cpase gene, modified the start
codon so it would be recognized by a eucaryotic system, eliminated the codons for the 24
carboxy-terminal residues, and inserted this gene into the yeast-like fungus Pichia
pastoris (Despreaux & Manning, 1993). The Pichia then expressed and secreted
water-soluble Cpase into its medium. Despreaux and Manning were able to identify Cpase
on sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) by its molecular
mass and its reaction with anti-Cpase antibody. They were kind enough to supply us with
the crude Pichia medium for our purification trials. The medium was available at first in
several small batches, the results of their own early small scale experiments, and later as
the first and second harvests of a large scale fermentation. The first harvest from the large
scale fermentation was denoted Sample A, and the second harvest was Sample B. Sample
A contained approximately 170 mg of Cpase as determined by penicillin assay
(Despreaux and Manning, personal communication). This sample was used in the
purification protocol described below.


Purification Protocol 


1. Pichia medium containing Cpase was treated with 80% ammonium sulfate at
Hoffmann-La Roche. The resulting precipitate was resuspended in 0.05 M
[tris(hydroxymethyl)aminomethane] (Tris) with 0.05 M sodium chloride at pH 7.5, frozen
and then shipped to the University of Connecticut.


2. At the University of Connecticut the material was thawed and then dialyzed in the cold
room against four changes of one liter each of 0.02 M [2-(N-morpholino)-ethanesulfonic
acid] (MES) buffer pH 6.5 with 0.02 M sodium chloride. The initial volume of the
Sample A slurry was 60 mls. This volume increased to 190 mls during dialysis. At the
end of dialysis, there was a small amount of white precipitate in the dialysis tubing, which
resisted all efforts at dissolution. Eighty mls of dialyzed Sample A were used in the
single large preparation described in this protocol (see figure 4).
Step 1   Ammonium sulfate precipitation

Step 2   Dialyzed; 80 mls of Sample A used (Cpase 72 mg)

Step 3   Applied to Mono-S as 4 separate runs

                      Run 1      Run 2      Run 3      Run 4
         Applied      5 mls      25 mls     25 mls     25 mls

Step 4   SDS-PAGE of eluent from Mono-S

                      Run 1      Run 2      Run 3      Run 4
         Volume       7 mls      8 mls      11 mls     9 mls
         OD           0.40       1.82       1.04       1.44

         Total Cpase recovered from four runs of Mono-S: 32 mg

Step 5   Dialyzed against 1.7 M ammonium sulfate

Step 6   Applied to Phenyl Superose

Step 7   Dialyzed against 0.02 M Tris pH 7.0

                      Run 1      Run 2      Run 3      Run 4
         Volume       15 mls     15.6 mls   25 mls     3.8 mls
         OD           0.11       0.352      0.16       0.39

         Total Cpase recovered: 12 mg

Figure 4. Flow chart for the purification of the genetically engineered Cpase
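The quantities in figure 4 can be cross-checked with simple arithmetic. The sketch below assumes that the 170 mg of Cpase in Sample A was distributed evenly through the 190 mls of dialyzed material, which appears to be how the 72 mg starting figure arises, and it expresses the recoveries after the Mono-S and Phenyl Superose steps as percentages of that starting amount. The function names are our own.

```python
def cpase_in_aliquot(total_mg, total_volume_ml, aliquot_ml):
    """Cpase (mg) in an aliquot, assuming it is evenly distributed in the sample."""
    return total_mg * aliquot_ml / total_volume_ml

def percent_yield(recovered_mg, starting_mg):
    """Recovery expressed as a percentage of the starting amount."""
    return 100.0 * recovered_mg / starting_mg

if __name__ == "__main__":
    # 170 mg in 190 mls after dialysis; 80 mls taken for the preparation.
    starting = cpase_in_aliquot(170.0, 190.0, 80.0)
    print(f"Cpase applied: {starting:.1f} mg")                                   # 71.6 mg
    print(f"Yield after Mono-S: {percent_yield(32.0, 72.0):.1f} %")              # 44.4 %
    print(f"Overall yield: {percent_yield(12.0, 72.0):.1f} %")                   # 16.7 %
    print(f"Yield across steps 5-7 alone: {percent_yield(12.0, 32.0):.1f} %")    # 37.5 %
```

On these figures somewhat more than half of the Cpase is lost at the cation exchange step and roughly five sixths over the whole procedure.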
The precipitate and any other particulate matter were removed by centrifugation for 20
minutes at approximately 20,000 × g (60 Ti rotor in a Beckman L7 ultracentrifuge
run at 20,000 RPM). At this point the medium was a greenish-brown color with a strong,
unpleasant odor. It had a very high optical density at 280 nm, most of which appeared to
be due to non-protein material, since most of this optical density passed directly through
the cation exchange column described below and appeared in the wash (figure 5), which
showed very little protein on SDS-PAGE (see figure 6).
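The relation between rotor speed and relative centrifugal force depends on the distance of the sample from the axis of rotation, which is not stated in the text. The sketch below uses the standard relation RCF = ω²r/g to show the radius at which 20,000 RPM would correspond to 20,000 × g. The result is a derived illustration only, not a rotor specification, and the function names are our own.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2

def rcf(radius_cm, rpm):
    """Relative centrifugal force (multiples of g) at a given radius and speed."""
    omega = 2.0 * math.pi * rpm / 60.0           # angular velocity, rad/s
    return omega ** 2 * (radius_cm / 100.0) / STANDARD_GRAVITY

def radius_for_rcf(target_rcf, rpm):
    """Radius (cm) at which the given speed produces the target RCF."""
    omega = 2.0 * math.pi * rpm / 60.0
    return 100.0 * target_rcf * STANDARD_GRAVITY / omega ** 2

if __name__ == "__main__":
    r = radius_for_rcf(20_000, 20_000)
    print(f"20,000 x g at 20,000 RPM corresponds to a radius of about {r:.1f} cm")
```

Since the force rises linearly with radius, material near the bottom of a tube experiences a higher force than material near the top, which is consistent with the figure being quoted only as approximate.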


3. Cation exchange chromatography was carried out using a Pharmacia Mono-S 10/10
cation exchange column on a Fast Protein Liquid Chromatography (FPLC) apparatus (see
Pharmacia manuals for details of operation) at pH 7.5 using 0.050 M phosphate buffer
with 2% V/V ethanol. No more than 25 mls of dialyzed sample could be applied for a run
without exceeding the capacity of the column. Sodium was used as the counter ion and a
gradient from 0 to 0.25 M sodium chloride was run in 50 mls of buffer. The flow rate was
3.0 mls per minute, and 1.0 ml fractions were collected. Ethanol was added because
pressure within the system increased as the material was passed through the column. This
indicated that some component was sticking to the column and restricting flow.
Presumably the ethanol dissolved this material, since addition of ethanol helped to
maintain the flow rate and did not appear to have any effect on separation. A typical plot
of absorbance at 280 nm vs. effluent volume for the 10/10 Mono-S under these conditions
is shown in figure 7. In this case the major peak emerged at 0.20 M sodium chloride with a
shoulder at 0.21 M sodium chloride. There were also two smaller peaks before the main
one. While the salt concentration at which these peaks emerged varied somewhat from
one run to the next, the shape of the curve was consistent, always showing small initial
peaks followed by the major one with a shoulder on the high ionic strength side.
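The gradient parameters given in step 3 fix the relation between effluent volume and salt concentration. The sketch below assumes an ideal linear gradient from 0 to 0.25 M sodium chloride over 50 mls and ignores the dead volume of the column and tubing, so the volumes it gives are measured from the point at which the gradient reaches the detector and not from any particular fraction number. The function names are our own.

```python
def nacl_at_volume(volume_ml, start_molar=0.0, end_molar=0.25, gradient_ml=50.0):
    """NaCl concentration (M) after volume_ml of an ideal linear gradient."""
    fraction = min(max(volume_ml / gradient_ml, 0.0), 1.0)  # clamp to the gradient
    return start_molar + fraction * (end_molar - start_molar)

def volume_at_nacl(conc_molar, start_molar=0.0, end_molar=0.25, gradient_ml=50.0):
    """Gradient volume (ml) at which a given NaCl concentration is reached."""
    low, high = sorted((start_molar, end_molar))
    if not low <= conc_molar <= high:
        raise ValueError("concentration lies outside the gradient")
    return gradient_ml * (conc_molar - start_molar) / (end_molar - start_molar)

def gradient_minutes(gradient_ml=50.0, flow_ml_per_min=3.0):
    """Time (minutes) needed to deliver the gradient at the given flow rate."""
    return gradient_ml / flow_ml_per_min

if __name__ == "__main__":
    print(f"Major peak (0.20 M): {volume_at_nacl(0.20):.0f} ml into the gradient")
    print(f"Shoulder (0.21 M):   {volume_at_nacl(0.21):.0f} ml into the gradient")
    print(f"Gradient duration:   {gradient_minutes():.1f} min")
```

With a slope of 0.005 M per ml and 1.0 ml fractions, the major peak and its shoulder lie only about two fractions apart, which may explain why they were not resolved from each other on this column.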
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Figure 5. Eight mls of crude Cpase A were loaded in two 4 ml portions onto an FPLC 10/10 Mono-S column 
at pH 7.5. O.D. at 280 nm is represented by line which shows fraction numbers. The second line shows the 
NaCl gradient. The twin peaks which occur before the start of the gradient are the wash (ie material which 
did not bind to the column). The single binding peak occurs between [NaCl] of 0.08 and 0.14 M. 
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Figure 6. SDS PAGE of fractions collected from the column in figure 5, stained with Coomassie Blue. Lane 
1 Crude Cpase A. Lane 2 twin wash peaks. Lanes 3 and 4, fractions 13 and 14. Lane 5 empty. Lane 6, 
fraction 15. Lane 7 molecular weight standards. Lane 8 Cpase from R. Charnas. Lane 9 fraction 16. Lane 
10 empty. Lanes 11-13 fractions 17, 18 and 19. While Cpase gives an apparent molecular weight of
approximately 46 kDa on SDS PAGE, the actual molecular weight of the genetically engineered Cpase as
determined by sequence analysis is 43 kDa.




Figure 7. Twenty-five mls of crude Cpase A were applied to a 10/10 Mono-S column. The numbers on the 
bottom of the curve refer to the fractions collected. See text for details. 




Figures 8a and 8b. SDS PAGE of fractions obtained during the run on the Mono-S column shown in figure
7, stained with Coomassie Blue. 8a: Lanes 1 - 4, fractions 24, 26, 28 and 29; Lane 5, Cpase from R. Charnas;
Lane 6, molecular weight standard; Lanes 7 - 10, fractions 30, 31, 32 and 34. 8b: Lane 1, molecular weight
standard; Lane 2, 1/5 dilution of fraction 30 as shown in 8a.
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4. Those fractions from the Mono-S with absorbance at 280 nm were subjected to SDS- 
PAGE (see figure 8 a and b). The major band on these gels was at 43 kD, the molecular 
weight of Cpase without the 24 carboxy terminal residues. There is a second band at about 
30 kD as well as a minor band at just over 43 kD (43+ kD). While the 43+ kD band 
cannot be distinguished on the gel in figure 8a, it can be seen clearly in figure 8b, which
shows a five-fold dilution of fraction 30 from figure 8a. The 43+ kD band may be the
result of an occasional failure to cleave the signal peptide as Cpase is transported across
the Pichia cell membrane (R. Manning, personal communication).

Some insight into the origin of the 30 kD protein was gained by examining its 
presence in a number of samples. Figure 9 shows a plot of absorbance at 280 nm vs 
effluent volume for a portion of Sample B applied to a Mono-S column while figure 10 
shows the SDS PAGE patterns for the collected fractions. It is evident that Cpase is only 
one of several proteins present in Sample B and that the concentration of the 30 kD 
protein is higher than in sample A. This portion of sample B had been kept frozen until it 
was applied to the Mono-S column. If, however, a sample was kept in the cold room for 
several weeks, a different pattern was seen on SDS PAGE after passage over the Mono-S 
column (see figure 11). In this case the 30 kD band was more prominent than the 43 kD band. This,
along with the fact that it was not possible to chromatographically separate the 30 kD
band from the 43 kD band, led to the idea that the 30 kD band may be a degradation
product of Cpase and that one of the other proteins present in the crude material is a
protease that nicks Cpase, the nicked enzyme remaining intact until it is denatured in
preparation for SDS PAGE.
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Figure 9. Thirty mls of Cpase B were applied to a 10/10 Mono-S column. See text for details. 


Figure 10. SDS PAGE of fractions collected during the run of the Mono-S column shown in figure 9,
stained with Coomassie Blue. Lane 1, crude Cpase B; Lane 2, early wash; Lane 3, later wash; Lanes 4 - 6,
fractions 10, 11, 12; Lane 7, molecular weight standard; Lane 8, Cpase from R. Charnas; Lanes 9 - 13,
fractions 13 - 17.
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Figure 11. SDS PAGE showing Cpase B stored in the cold room prior to ion exchange chromatography, 
Lane 7. Lane 1, crude Cpase B; Lane 3, Cpase from R. Charnas; Lane 4, molecular weight standard. The 
gel is stained with Coomassie Blue. 


Evidence supporting this hypothesis is shown in figure 12. A portion of Sample B
was passed over the Mono-S and then subjected to SDS PAGE. It is evident that the
sample so prepared contains 43 kD, 30 kD and 14 kD proteins and very little, if any,
other protein. A sample of the partially purified material was then allowed to sit at room
temperature for more than one month. At intervals aliquots were removed and frozen
until the end of the experiment. As can be seen, the ratio of 30 kD to 43 kD protein
remained constant over time, which is consistent with the idea that a protease, which is
removed during the first ion-exchange chromatography step, is responsible for the
degradation of the 43 kD protein into 30 kD and 14 kD proteins. It also indicated that
Cpase was stable at room temperature once the other proteins were removed from the
sample. This is important because crystal growing requires protein to remain unfrozen for
weeks or months.




Figure 12. SDS PAGE on Cpase B samples allowed to sit at room temperature. See text for details. On both
gels, Lane 1, Cpase A; Lane 2, Cpase B; Lanes 4 and 6, molecular weight standards; Lane 5, Cpase from R.
Charnas. Figure 12a: Lanes 3, 7 and 8, Cpase B at room temperature for 0, 5 and 9 days respectively. Figure
12b: Lanes 3, 7 and 8, Cpase B at room temperature for 13, 20 and 25 days respectively. Both gels are stained
with Coomassie Blue.
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A nitrocefin assay was performed on the partially purified 43 kD protein in order 
to demonstrate Cpase activity. This assay was chosen for its procedural simplicity even 
though it requires large amounts of enzyme. Nitrocefin is a β-lactam and therefore a
Cpase inhibitor. Upon binding to Cpase, the β-lactam ring is broken, with a
subsequent shift in color from the yellow associated with intact nitrocefin to red, the color
associated with nitrocefin when the β-lactam ring is broken (see figure 13). The inhibitor
is then slowly released by Cpase so that other molecules can be hydrolyzed and there is,
therefore, a gradual increase in red color as Cpase is incubated with nitrocefin. This
increase in red color can be followed spectrophotometrically at 480 nm. The Cpase used
in this experiment came from fractions 13 and 14 of the run of the Mono-S shown in
figure 9. These fractions were combined, their volume was reduced to 0.5 mls and the
buffer was changed to MES at pH 6.0. The following procedure was used. A solution of
0.1 mM (0.3 mg/ml) of nitrocefin in MES buffer pH 6.0 was prepared. As nitrocefin is
not directly soluble in water, it must first be dissolved in a small amount of dimethyl
sulfoxide (DMSO) and then added to the buffer. Nitrocefin samples (350 µl) were
incubated with 50 µl of protein solution for 2 hours and 15 minutes in a water bath at 30°
to 32° C. The protein solution was prepared by adding a volume of Cpase as prepared
above to sufficient water to give a total volume of 50 µl. At the end of the incubation
period, the O.D. at 480 nm was read and recorded.


Figure 13. The opening of the β-lactam ring of nitrocefin causes a change in color from yellow
to red (O'Callaghan et al., 1972; O'Callaghan, 1979).




Figure 14 shows a graph of the volume (µl) of Cpase vs. O.D. at 480 nm at the end of the
incubation period. There is a linear relationship between the amount of protein
solution used and the O.D. at the end of incubation. Since the 43 kD protein is the major
constituent of this sample (see figure 10) and since Cpase is known to have a molecular
weight of 43 kD, these results provide good evidence that the 43 kD protein is an active
Cpase. It is also of interest to note that within 48 hours all samples containing Cpase were
noticeably pink while the blanks retained the original yellow color associated with
unhydrolyzed nitrocefin.

Figure 14. Nitrocefin assay of Cpase B. See text for details. Y-intercept = 0.092 ± 0.006; slope = 0.0018 ± 0.0002.
The square and diamond symbols represent duplicate trials.
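
The linear relationship reported for the nitrocefin assay can be illustrated with an ordinary least-squares fit. The sketch below is not the original analysis: the individual data points are not given in the text, so the volumes and O.D. readings are invented, generated from the two values in the figure 14 caption read as an intercept (0.092) and a slope (0.0018 O.D. units per µl). The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical readings: volumes of Cpase (µl) made up to 50 µl with water,
# with O.D. values synthesized from the caption values, NOT measured data.
volumes_ul = [0, 10, 20, 30, 40, 50]
od_480 = [0.092 + 0.0018 * v for v in volumes_ul]

slope, intercept = fit_line(volumes_ul, od_480)
print(f"slope = {slope:.4f} O.D. per µl, intercept = {intercept:.3f}")
```

On real duplicate readings the same function would be applied to the pooled points; the nonzero intercept then corresponds to the absorbance of the nitrocefin blank.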
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5. The purpose of the next step was to separate the 43+ and the 43 kD proteins. 
Preliminary experiments with HPLC (High Pressure Liquid Chromatography) had shown
that Cpase will bind to and elute from a C-4 hydrophobic column; however, a significant
amount of protein became denatured during the procedure, which required trifluoroacetic
acid in acetonitrile and trifluoroacetic acid in water as a solvent system. Therefore it
seemed reasonable to try a hydrophobic column which used a gentler solvent system in an
attempt to separate the 43 kD and the 43+ kD proteins. After a preliminary experiment
which demonstrated that Cpase could bind to and be eluted from phenyl sepharose with
no apparent ill effects, a sample of Cpase was applied to a phenyl superose 5/5 column on
the FPLC. The only difference between phenyl sepharose and phenyl superose is the type
of matrix; the functional phenyl groups are exactly the same. In this case fractions 27-37
from the run of the Mono-S column described in figure 7 were combined and dialyzed
against the starting buffer for the phenyl superose hydrophobic column (1.7 M
ammonium sulfate in 0.05 M phosphate at pH 7.0). The volume of the combined fractions
was about 11 mls. After dialysis the volume was 5.8 mls, a decrease of about 50% which
was typical. The optical density of the 11 mls before dialysis was 1.044 at 280 nm,
indicating that the sample contained approximately 11.5 mg of protein, assuming that
1 mg/ml of protein will give an absorbance of 1 (Cantor and Schimmel, 1980). Since
Pharmacia lists the loading capacity for the 5/5 phenyl superose column as 5-10 mg of
protein, half the material was loaded at a time. There was no evidence of unbound protein
as the column was washed.
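
The protein estimate and the decision to load half the material at a time follow from simple arithmetic, sketched below. The function names are hypothetical, and the calculation relies on the same rough assumption stated in the text (1 mg/ml gives an absorbance of 1 at 280 nm), with the upper end of the quoted 5-10 mg capacity taken as the limit.

```python
import math


def protein_mg(a280, volume_ml, a280_per_mg_per_ml=1.0):
    """Estimate total protein (mg) from absorbance at 280 nm and sample volume."""
    return a280 / a280_per_mg_per_ml * volume_ml


def split_load(total_mg, capacity_mg):
    """Return (number of runs, mg per run) so that no run exceeds the column capacity."""
    runs = math.ceil(total_mg / capacity_mg)
    return runs, total_mg / runs


total = protein_mg(a280=1.044, volume_ml=11.0)      # pooled Mono-S fractions
runs, per_run = split_load(total, capacity_mg=10.0)  # upper limit of 5-10 mg range

print(f"estimated protein: {total:.1f} mg")
print(f"load in {runs} runs of about {per_run:.1f} mg each")
```

Under these assumptions the estimate is about 11.5 mg, which exceeds the stated capacity and so requires two runs of roughly 5.7 mg each.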


6. After a sample was loaded onto the phenyl superose 5/5 column, the column was
washed and then protein was eluted by lowering the concentration of ammonium sulfate
in the eluent. A two-step gradient was run. First 10 mls of 1.5 M ammonium sulfate were
run through the column, followed by 15 mls of 1.3 M ammonium sulfate (see figure 15).
The flow rate was 0.5 mls per minute and 0.5 ml fractions were collected. Fractions
showing absorbance at 280 nm were subjected to SDS-PAGE (see figure 16). The phenyl
superose column separated the two proteins with molecular weights close to 43 kD so
that the component with the slightly higher molecular weight emerged first, followed by
the one with the lower molecular weight. There was some variability in when the Cpase
began to emerge from this column. Occasionally it was necessary to run the second step
for a longer period of time. The situation was not improved by running a linear gradient, as
shown in figure 17. Figure 18 shows the protein patterns of column fractions for this
sample. Again, it can be seen that the higher molecular weight material elutes from the
column before the lower; however, the number of fractions containing Cpase is greater
and therefore the net result of running a linear gradient was a greater dilution of the
protein.
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Figure 15. Partially purified Cpase was applied to a 5/5 phenyl superose column and a step gradient was run. 
The numbers at the bottom of the tracing indicate the fraction numbers. See text for details. 
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Figure 16. SDS PAGE of the fractions collected from the phenyl superose column shown in figure 15. 
Lanes 1 - 4, fractions 30, 32, 34 and 36; Lanes 5 and 14, Cpase from R. Charnas; Lanes 6 and 15, molecular
weight standard; Lanes 8 - 13, fractions 38, 40, 42, 44, 48 and 46. Note that lane 2 shows a double band.
The gel is stained with Coomassie Blue.

Figure 17. Partially purified Cpase was applied to a 5/5 phenyl superose column and a linear gradient was
run. The numbers at the bottom of the tracing indicate the fraction numbers.
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Figure 18. SDS PAGE of the fractions collected from the phenyl superose column shown in figure 17, 
stained with Coomassie Blue. Lanes 1 and 2 fractions 22 and 26, Lane 3 empty; Lanes 4 and 5 fractions 28 
and 30, Lanes 6 and 16 Cpase from R. Charnas; Lanes 7 and 17 molecular weight standards; Lanes 8 - 15, 
fractions 34, 36, 38, 40, 42, 44, 46; Lanes 18 - 21, fractions 48, 50, 54 and 56. 
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7. Fractions from each run of the phenyl superose column showing only the lower 
molecular weight 43 kD component were combined and dialyzed against three 0.5 liter
changes of 0.02 M Tris with 0.02% sodium azide at pH 7.0. A small portion (0.05 to 0.1
mg) of each sample was applied to a 5/5 Mono-S on the FPLC in order to examine its
purity. A linear gradient was run from 0 to 0.45 M sodium chloride with a flow rate of
1.0 mls per minute. Because such small amounts were injected, fractions were not
collected. In general, a single sharp peak such as that shown in figure
19 was obtained, although the exact sodium chloride concentration at which the peak
appeared did vary. This variation was possibly due to a minor leak in the gasket of one of
the pumps.


Figure 19. Purified Cpase was applied to a 5/5 Mono-S column. A single protein peak appeared at 0.32 M
NaCl. Fractions were not collected. See text for details.


The four samples indicated at the bottom of figure 4 showed single, sharp peaks. They
were concentrated using Amicon microconcentrators which allowed any molecule with a
molecular weight less than 30 kD to pass through. When the sample volume was reduced
sufficiently to give protein concentrations of 1 mg per ml or greater, concentration was
stopped and SDS-PAGE was performed on the separate samples (see figure 20). The four
concentrated samples (runs 1 - 4 shown in figures 4 and 20) were combined and further
concentrated to a final volume of about 1 ml and a protein concentration of approximately
10 mg/ml. SDS PAGE on the combined and concentrated sample used for crystal growth
showed a single sharp protein band (see figure 21).

For a general discussion of protein purification see Deutscher (1990). Details of
the methods used here are given in the relevant Pharmacia manuals.


Figure 20. SDS PAGE of Runs 1 - 4 of purified Cpase shown in figure 4. Lane 1, Run 3; Lane 2, Run 1; 
Lane 3, molecular weight standards; Lane 4, Run 2; Lane 5, Run 4. 
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Figure 21. Purified Cpase samples from Runs 1 - 4 (see figure 4) were combined and subjected to SDS- 
PAGE. Lane 3, Cpase from R. Charnas; Lane 4, molecular weight standards; Lanes 1, 2 and 5 show
different dilutions of the purified Cpase, with Lane 1 the most concentrated, Lane 2 a 1/2 dilution and Lane
5 a 1/10 dilution of the Cpase in Lane 1.
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Other Results 


The small batches of Pichia medium originally supplied to us by Hoffmann-LaRoche
were used to experiment with several other purification techniques. While these methods
did not prove useful in the final purification protocol, in some cases the results obtained
were interesting.

Two hundred milliliters of Pichia medium containing approximately 35 mg of 
Cpase were applied directly to a Zeta Prep cationic exchange cartridge. A linear gradient 
from 0 to 1.0 M sodium chloride was then run and fractions were collected. All of the 
absorbance at 280 nm appeared in the wash. The cartridge was then stripped with 2 M
sodium chloride and again no protein was found adhering to the cartridge. The experiment
was run at pH 7.5, the same pH as was used with the Mono-S. The medium in this case 
had not been treated with ammonium sulfate and was very thick. It seems probable that 
some components of the medium coated the resin of the cartridge so that the functional 
groups were not available to bind protein. 

As mentioned in part 5, HPLC was tried using a hydrophobic C-4 column with
0.1% trifluoroacetic acid in acetonitrile as the starting buffer and 0.1% trifluoroacetic acid in
water as the final buffer, with a linear gradient between them. Cpase bound to and was
eluted from the column with no difficulty. However, when the water and trifluoroacetic acid
buffer was replaced with Tris at pH 7.5, a heavy precipitate appeared. Some of this
precipitate redissolved when 0.1 M sodium chloride was added to the Tris buffer. HPLC
using this solvent system denatured Cpase in such a fashion that it was not readily
renatured. The other problem with the method was that only small amounts of material
could be passed through the column during a single run.

An attempt was made to separate the 30K and 43K proteins using a penicillin-affinity
resin. SDS-PAGE of the protein eluted from the resin showed both the 30K and
the 43K proteins. Thus affinity chromatography did not appear to hold much promise for
separating the 30K and 43K proteins.

Trials with the Mono-S column were initially conducted at pH 5.5 and 6.5 as well 
as at pH 7.5. At pH 5.5 two overlapping peaks appeared which contained both the 43
kD and the 30 kD protein as well as proteins of other molecular weights. At pH 7.5
only the 43 kD and the 30 kD protein adhered to the column. These emerged as a
single peak as described earlier. 

Two attempts were made to do two dimensional gels with isoelectric focusing in
the first dimension and SDS PAGE in the second. The first attempt, using Cpase from
sample B which had been passed over a Mono-S column, showed an unfocused band at
43 kD and a streak at 30 kD which had three areas of heavy protein concentration located
at the basic end (near pH 9.0) of the gel. There was one additional isolated spot just over 30
kD, also at the basic end of the gel. The second attempt, using Cpase from sample A which
had been passed over a Mono-S column, showed only an unfocused band at 43 kD. This
suggests that Cpase may have an isoelectric point above pH 7.5, as many basic proteins
fail to focus because of the instability of the basic region of the isoelectric focusing
gradient (O'Farrell et al., 1977). This is consistent with the fact that Cpase has sufficient
positive charge to bind to a cationic exchange column at pH 7.5 and with the report of
Waxman and Strominger (1979) that membrane-derived Cpase has an isoelectric point
somewhere around 10.


Crystallization of Cpase 


Protein crystals are often grown by the vapor diffusion method in which a small
drop of buffer (commonly 10 microliters or less) containing enzyme is placed on a
microscope slide cover slip. The cover slip is then inverted and suspended over a tissue culture
well (see figure 22). High vacuum grease or oil is placed between the cover slip and the
well to form a seal. In addition to protein and buffer, this drop normally contains a
hydrophilic precipitant such as polyethylene glycol and/or water-soluble salts. The tissue
culture well contains a relatively large volume (usually one milliliter) of buffer at a higher
ionic strength than the drop; therefore water moves via vapor diffusion from the drop to
the well. As this happens the concentration of enzyme and precipitant in the drop
increases, finally reaching a point where the enzyme is no longer soluble. If this process
proceeds slowly enough, it may result in the formation of well ordered crystals rather than
amorphous precipitate. Good discussions on the growth of protein crystals are given by
McPherson (1982), Carter (1990) and Weber (1991).


Figure 22. Diagram of vapor diffusion apparatus for growing protein crystals. (Labels in the
diagram: drop containing protein solution; well solution.)

Although crystal growing is an empirical exercise with few inviolate rules, there are
some general strategies which give it some order. One of these is to perform a series of
solubility experiments on a new protein preparation in order to determine the lowest
precipitant concentrations at which the protein will come out of solution. Drops are then
set up so that they will slowly approach these conditions. An experiment of this type can
be done with very little protein. For example, tissue culture wells are prepared which
contain buffer and varying concentrations of precipitant. Then an aliquot of about three
microliters of protein (the protein concentration is normally 5 mg/ml or greater) is placed
on a cover slip and one-half to one microliter of a solution containing buffer and a
calculated amount of precipitant is added. Next, to prevent evaporation, the cover slip is
inverted over the well containing the same final concentration of precipitant as is present
in the drop, and the drop is observed under a microscope. If no precipitation is observed,
the cover slip is removed from the well and another aliquot of precipitant is added, and
again the cover slip is inverted over a well having the same concentration of precipitant as
the drop. This process is continued until precipitation is observed or the maximum usable
concentration of precipitant in the drop has been reached. The maximum usable
concentration of precipitant is the concentration at which that substance becomes
saturated. As it is often difficult to work with solutions which are near their saturation
point, the hope is to find a precipitant which works well considerably below that point;
however, that may not always be possible. If precipitation does occur, the drops can be
observed for several days to see if the precipitate dissolves over time or if needles or other
small crystals start to appear in the drop. Because vapor diffusion often produces slightly
different results than batch precipitation and because the protein concentration, which is
not controlled for in the preliminary experiment, affects crystal formation, only
approximate conditions for crystal growth can be determined in this way.
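
The stepwise solubility experiment described above amounts to a running dilution calculation, which can be sketched as follows. This is an illustration only: it assumes that every aliquot comes from a single precipitant stock, that volumes are additive and that no evaporation occurs between additions. The 30% stock, the 1 µl aliquots and the function name are hypothetical.

```python
def titrate_drop(protein_ul, protein_mg_ml, stock_conc, aliquots_ul):
    """Return (precipitant conc, protein mg/ml) in the drop after each aliquot."""
    added_ul = 0.0
    steps = []
    for aliquot in aliquots_ul:
        added_ul += aliquot
        total_ul = protein_ul + added_ul
        precipitant = stock_conc * added_ul / total_ul   # same units as stock_conc
        protein = protein_mg_ml * protein_ul / total_ul  # protein is diluted each time
        steps.append((precipitant, protein))
    return steps


# 3 µl of 5 mg/ml protein, titrated with 1 µl aliquots of a 30% (w/v) stock.
steps = titrate_drop(protein_ul=3.0, protein_mg_ml=5.0,
                     stock_conc=30.0, aliquots_ul=[1.0, 1.0, 1.0])

for n, (precipitant, protein) in enumerate(steps, start=1):
    print(f"after aliquot {n}: {precipitant:.1f}% precipitant, "
          f"{protein:.2f} mg/ml protein")
```

Each value in the first column is the precipitant concentration that the matching well would need to contain, and the second column shows why the protein concentration is not controlled in this preliminary experiment.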

A second strategy is to have the well solution at the final desired precipitant
concentration for the drop and then make the drop by using one half well solution and
one half protein solution. Vapor diffusion will continue until the activities of the two
solutions are the same. The advantage is that if crystals do form it can be assumed that
they formed in precipitant concentrations similar to that of the well. With this approach it is
easier to relate the results of one experiment to other experiments because the final
crystallization condition can be easily described. This method is particularly useful
where the protein solution contains nothing other than protein and a low concentration of
buffer.
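
The second strategy can be made concrete with a short calculation. The sketch below rests on simplifying assumptions that are not in the text: the protein solution contains no precipitant, the well is large enough that its concentration does not change, and equilibrium is taken to mean equal precipitant concentration in drop and well (a stand-in for equal water activity). The function name and the example values are hypothetical.

```python
def equilibrate_drop(protein_mg_ml, well_precipitant, protein_ul=5.0, well_ul=5.0):
    """Initial and approximate final composition of a drop mixed from
    protein solution and well solution, equilibrated against the well."""
    initial_ul = protein_ul + well_ul
    initial_precipitant = well_precipitant * well_ul / initial_ul
    initial_protein = protein_mg_ml * protein_ul / initial_ul

    # Water leaves the drop until its precipitant concentration matches the well,
    # so the drop shrinks by the ratio of initial to final precipitant concentration.
    remaining_fraction = initial_precipitant / well_precipitant
    final_ul = initial_ul * remaining_fraction

    return {
        "initial_precipitant": initial_precipitant,
        "initial_protein": initial_protein,
        "final_volume_ul": final_ul,
        "final_precipitant": well_precipitant,
        "final_protein": initial_protein / remaining_fraction,
    }


# Example: 5 µl of 10 mg/ml protein mixed with 5 µl of well solution at 10% PEG.
result = equilibrate_drop(protein_mg_ml=10.0, well_precipitant=10.0)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

For an equal-volume mix the drop starts at half the well concentration and ends at roughly half its starting volume, which is why the final condition can be described simply as that of the well.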
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Crystallization of Membrane-derived Cpase 


Diffraction-quality crystals of enzymatically-solubilized Cpase were obtained
from three of the five batches (designated Cpase I - Cpase V respectively) of purified
enzyme obtained from R. Charnas at Hoffmann-LaRoche in Basel, Switzerland.
Although each batch varied in the solubility properties of the enzyme, in the conditions
under which crystals were formed, and sometimes even in the crystal habit, all of the
crystals for which diffraction information was obtained belonged to the same space group
and had identical unit cell dimensions.


Cpase I   The first batch of Cpase to arrive yielded the best crystals. Cpase I, which
had a protein concentration of about 10 mg/ml, arrived in 0.02 M Tris pH 7.5 with 0.1
molar sodium chloride. Crystals were grown by placing approximately 10 microliters of
enzyme on a cover slip over a well containing Tris at pH 7.5 with 0.2 molar sodium
chloride and 10% - 15% polyethylene glycol (PEG) (M.W. 8,000). Crystals formed
within one week, but tended to dissolve if the temperature exceeded 85°F or if the
solution in which they were formed was exposed to air so that any significant evaporation
took place. The stability of these crystals increased significantly if they were held in 20%
PEG 8000. A photograph of one group of these crystals appears in figure 23. The two
largest crystals have a hollow core for at least half their length. This condition indicates
much more rapid growth on two surfaces than on the third. The size of the largest crystal
is approximately 2.0 mm x 0.33 mm x 0.17 mm. A solid part of one of the crystals in this
photograph was used to take the X-ray precession photographs shown in figures 30 and 31.
These photographs were used to determine the size of the unit cell and the space group
(see Determination of Space Group).

Figure 23. Cpase I Crystals. The drop contained 10 µl of 10 mg/ml protein solution in 0.1 M NaCl at pH
7.5 with Tris buffer. The well solution was also at pH 7.5 with 0.2 M NaCl and 10% PEG 8000. These
crystals grew to their full size in 5 days. The larger two have holes which extend about 1/2 of their
length. See text.

Cpase II   Cpase II was not as pure as Cpase I. R. Charnas reported that it showed
microheterogeneity on HPLC and had several minor bands in addition to the major band at
45 kD on SDS PAGE. The protein concentration was 11.2 mg/ml and it arrived in
0.02 M N-methylmorpholine acetate at pH 5.5.

The first experiments with Cpase II involved trying to repeat the conditions which
produced the good quality crystals obtained from Cpase I. Therefore only sodium chloride
was added to the drops. When the initial salt concentration in the drop was less than 0.1
M, needles formed, but when the initial sodium chloride concentration was greater than
0.1 M the protein tended to remain in solution. In some cases, sodium chloride crystals
were formed in drops where the protein remained in solution.
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The best crystals of Cpase II were obtained from drops which contained 4.5 
microliters of the Cpase II solution and 5.5 microliters of 0.02 M Tris buffer at pH 7.5
containing 0.1 M NaCl, over a well containing 0.2 molar NaCl, 10% (W/V) PEG 8000 in
Tris at pH 7.5. For several weeks no crystals appeared. It was, however, noticed that a
skin had formed between the drop and the cover slip. At this time, the tip of a
Hamilton syringe was dragged through these drops, forming a furrow in this skin.
Crystals then formed along this furrow, sometimes so rapidly that one could observe their
formation while watching through a microscope. Pictures of crystals from two of these
drops are shown in figures 24 and 25. The longest needle-shaped crystals were about 2
millimeters in length. Figure 24 shows some crystals with hexagonal shapes. Hexagonal
crystals were not observed with Cpase I. These crystals were transported in sealed quartz
capillaries to the laboratories of Hoffmann-LaRoche in Basel, Switzerland, where a
summer of crystallization and X-ray diffraction studies was planned. However, when a
technician at the facility inserted the capillaries into a bit of clay for storage, the wax plugs
were dislodged and the crystals desiccated, leaving them useless.

Other crystals were grown from Cpase II while at Hoffmann-LaRoche under
conditions similar to those described above. The hexagonal shape was not seen again, but
other crystals of the long, slender variety were studied on the area detector. They
diffracted, but only to a resolution poorer than 5 Å. One diffraction pattern was indexed
and appeared to be of the same space group and to have the same unit cell dimensions as
the crystal from Cpase I used to take the earlier X-ray photographs.

Figure 24. Cpase II Crystals. Magnification 100X. The drop contained 8 mg/ml protein, 0.1 M NaCl in Tris
buffer pH 7.5. Final well solution was 0.2 M NaCl, 10% PEG in Tris buffer pH 7.5. After several weeks
without the appearance of any crystals, a skin was noticed between the drop and the cover slip. The tip of a
Hamilton syringe was dragged through this skin, forming a furrow. Crystals then formed along this furrow.
The far ends of the hexagonal crystals are out of focus because they extend away from the cover slip.

Figure 25. Cpase II Crystals. Magnification 100X. Growth conditions the same as described above.

Cpase III   R. Charnas reported that Cpase III appeared homogeneous on SDS
PAGE, IEF and on FPLC. The protein arrived with a concentration of 8.0 mg/ml in 0.02
molar sodium acetate buffer at pH 5.5. Unfortunately no crystals of any size were grown
from this preparation. A drop containing approximately 0.1 molar NaCl, 8.0 mg/ml
protein at pH 5.5 over a well containing 0.2 molar NaCl and 20% PEG 8000 at pH 5.5
lost considerable volume but remained clear, while a drop containing about 7 mg/ml of
protein with 0.1 molar NaCl and 4% PEG at pH 5.5 over a well with 0.2 molar NaCl and
20% PEG 8000 at pH 5.5 formed many small needles and aggregates of these needles
(see figure 26). Increasing the PEG concentration in the drop did not improve crystal
formation and sometimes resulted in the liquid in the drop separating into two distinct
phases.


Figure 26. Aggregates of small crystals formed from Cpase III.

Cpase IV   Cpase IV was studied while at the laboratories of Hoffmann-LaRoche
in Basel, Switzerland. The protein, at a concentration of approximately 17 mg/ml, was
dissolved in 0.05 molar Tris at pH 7.5 with 0.01 molar sodium chloride. The solution
contained approximately 0.025% Triton detergent which remained after the solubilization
of bacterial membranes during the initial stages of enzyme isolation. Precipitation
experiments showed that both salt and PEG were required in order to force Cpase from
solution.

The crystals of Cpase obtained thus far had two major problems. The first was
their shape, which was very long in one dimension and very short in the other two. This
asymmetry leads to technical problems with getting the bulk of the crystal into the X-ray
beam. The second problem was that the X-ray photographs of crystals of Cpase I indicated a
very large asymmetric unit. Such a large asymmetric unit must contain several copies of
the enzyme molecule. Solution of the structure of a protein is generally easier when there
is only one molecule per asymmetric unit. Therefore a serious effort was made to obtain a
different crystal form. This effort included attempts to replace the NaCl in the drop with
another salt. The salts tested included KF, CaCl2, (NH4)2SO4, MgSO4, Li2SO4, LiCl
and MgCl2. While many of these salts also gave crystals, the shape of the crystals did
not differ from those formed with NaCl. A wide range of pH was also tried. No crystals
formed at low pH, while a pH of 7.5 seemed to produce the best results. It was possible to
grow crystals in the presence of penicillin G, but again the shape of these crystals was the
same as those grown without penicillin. Additions of octyl-β-D-glucopyranoside resulted
in the formation of plates and sheets instead of three-dimensional crystals. Therefore
efforts were concentrated on optimizing the concentrations of NaCl and PEG in the drops
to give the best three-dimensional crystals. Figure 27 is a graphical representation of the
results obtained for various crystallization conditions.
Figure 27. Graphic Representation of the Results of Crystallization Trials with Cpase IV 

X-ray diffraction data were collected on two crystals of Cpase IV. The first 
crystal diffracted to about 5 Å, while the second, which was broken into three pieces, 
diffracted to 3.2 Å. One piece was used to collect a native data set, and another piece was 
soaked in K2PtCl4 to provide a heavy atom data set (see Collection of Data Using the Area 
Detector). The third piece was brought to the University of Connecticut, but was exposed 
to high temperatures as a result of a failure in the cooling system, after which it did not 
diffract. 


Cpase V    Cpase V arrived in Storrs from Switzerland with a protein 
concentration of about 5 mg/ml, in 0.01 M Tris at a pH of 7.5. The preparation contained 
no sodium chloride. Needles were produced from drops where the final concentrations 
were as follows: protein about 5 mg/ml, 15% PEG 8000, 0.2 M NaCl, 0.2 M Tris pH 7.5; 
and 25% PEG 8000, 0.3 M NaCl, Tris pH 7.5. No crystals large enough to X-ray were 
obtained from this batch. 


Crystallization of Genetically Engineered Cpase 


Initially small aliquots (1 µl) of the purified, genetically engineered enzyme were 
tested with various precipitants to determine which materials could force Cpase out of 
solution and the approximate concentration at which this would happen. The following 
materials were tested: PEG 8000, PEG 4000, PEG 400, MPD, lithium sulfate, potassium 
tartrate, ammonium dihydrogen phosphate, potassium hydrogen phosphate, sodium 
acetate, sodium chloride and ammonium sulfate. The only salt which caused Cpase to 
precipitate was (NH4)2SO4. Cpase remained in solution in all other cases despite salt 
concentrations in excess of 1 molar. Cpase could be made to precipitate with 
polyethylene glycol of molecular weights 400, 4,000 and 8,000. However, with PEG 
8000 small needle crystals appeared along with the precipitate. Needles were not 
observed with any other precipitant. Precipitation was also observed with MPD. Since 
the hope was not simply to reproduce the previous crystallization results, but to produce a 
different crystal form of better quality, further experiments with PEG 400, PEG 4000, 
MPD and ammonium sulfate were undertaken. In these vapor diffusion experiments, 
small amounts of various salts were added to the protein drops and the pH was varied. 
Unfortunately none of them showed any indication of forming crystals; no needles or 
sheets were produced, only more precipitate. 

Attention was then turned to what could be done with PEG 8000. Table I lists the 
results of over 90 different trials with various concentrations of PEG, sodium chloride 
and glycerol at a number of different pHs and protein concentrations. A survey of these 
results indicates that glycerol was not helpful in crystal formation; that at a pH of 6.5 or 
lower, a phase separation occurred; that few crystals formed at pH 8.5; and that a fairly 
broad range of protein concentrations appears to produce crystals. The effects of sodium 
chloride and PEG 8000 are not as straightforward. Apparently at least 12.5% PEG 8000 
is required for Cpase to come out of solution. No crystals were formed where there was 
not at least 0.1 M sodium chloride present. Better formed crystals appeared with 
12.5 - 15% PEG 8000, 0.2 - 0.4 M sodium chloride and a pH in the 6.8 to 7.5 range. Most 
of the crystals had the same general form (habit) as that for enzymatically solubilized 
Cpase, that is they were long and narrow or bow-shaped (having narrow ends, a much 
thicker middle and a slightly curved shape). 

The usefulness of a crystal for diffraction experiments is dependent on both the size 
and the degree of order of the crystal. If a small crystal is a very good diffractor, it may 
last long enough in the X-ray beam to produce a data set. On the other hand, if it is 
possible to grow very large crystals which diffract only moderately well, it may be 
possible to get a data set because high count rates can be obtained more rapidly, while a 
similar small crystal might be essentially useless. The largest of these crystals measured 
approximately 1.5 mm x 0.05 mm x 0.05 mm, which puts them below the probable limit 
for usefulness with the area detector. It may be possible to increase the size of these 
crystals by further experiments, but it is unlikely that crystals more than twice this size 
would be obtained using the current conditions, which have already been optimized. 
Therefore, two of these crystals were X-rayed using the area detector. The first one was 
grown in a drop where the final concentrations were approximately 5 mg/ml protein, 0.2 
M NaCl and 12.5% PEG 8000 at pH 7.5 and was approximately 1.5 mm x 0.05 mm x 
0.05 mm or less. It showed only a few diffraction intensities, which indicated that much 
larger crystals of this type would be required in order to collect a data set using the area 
detector. A second crystal, which was grown in a drop where the final concentrations 
were approximately 5 mg/ml protein, 15% PEG, 4.0 M NaCl at pH 7.0, was 0.5 mm x 
0.05 mm x 0.05 mm or less. This crystal showed no diffraction at all, possibly due to its 
small size. A third crystal, which measured 0.4 mm x 0.03 mm x 0.03 mm and which had 
an interesting shape, with both ends being square, broke apart as a result of tension on the 
crystal during liquid removal in preparation for X-ray analysis. Thus it was clear that in 
order to obtain data sets with the area detector, crystals of this type must be several times 
thicker than these were. 


Table I Crystallization Experiments with Engineered Cpase 
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Sample 


Dér 6-25-93 
D5u 7-20-93 
C4, 7-17-93 
C6, 7-17-93 
D4 7-20-93 
C6u 7-20-93 
C5 7-20-93 
Cé..6-25-93 
C2: 6-25-93 
D3 6-25-93 
B3; 7-17-93 
Bl 7-17-93 
B3u 7-20-93 
B6 7-20-93 
C6, T- 7-93 
D5. 7-12-93 
B5, 6-15-93-2 
B3 6-15-93-2 
B2: 7.-7-93 
C6. 7-12-93 
Bl 7-20-93 
Blu 7-20-93 
A4 7-7-93 
B3, 7-12-93 
C4u 7-20-93 
G3; 7-20-93 
A4 7-12-93 
B1 6-15-93 
Al 6-28-93 
A5 6-28-93 
DS. 8-2-93 
C2 8-2-93 
Bl 8-2-93 
D3 8-2-93 
B6 8-2-93 
A4 8-2-93 
C4 8-2-93 
B3 7-30-93 
C4 7-16-93 
A2 7-30-93 
C1 7-16-93 
A4 7-30-93 
B1 7-30-93 
B6 7-30-93 
B3 8-2-93 
C2 7-30-93 
A2 8-2-93 
A4 6-25-93 
D2 7-20-93 
B3 7-16-93 
D2u 7-20-93 
B6 7-16-93 
A5u 7-20-93 


% PEG   M NaCl   pH   Conc. E   Gly   Result   Sample 
15 0-2 6.8 
15 0.1 8.5 Pirie te ees 
15 0.1 95 gia x ine needles C4 7-7-93 
is 6.3% 75 7 : pe ve pepe D2 7-12-93 
15 0.1 7.5 i 5 Eire iroa A2 6-15-93-2 
15 o-i 75 i “4 nipa eedles A2 6-25-93 
15 0.1 7.5 i r riaa B1 6-25-93 
15 0.1 75 0.5 A4 6-15-93-2 
: 5 clear B3 6-25-93 
5 0.1 75 0.5 0 needles C4 7-12-93 
15 Une Teo 0.5 
0 xtal B1 7-7-93 
15 O 6.8 0.33 0 needles A2u 7-20-93 
ne 0.1 6.8 0.33 0 needles A2 7-20-93 
$ oie 6:75 0.5 0 separation B1 7-12-93 
3 be 6.5 0.5 0 separation A2 7-7-93 
= aa aoa te : year A2 7-12-93 
$ 9 ~ ine needles D5 6-15-93 
tee 1 0 fine needles D2 6-15-93 
12.5 0.3 8.5 0.33 0 ppt D3 8-2-93 
12.5 0.3 TSS 0.33 0 clear C2 8-2-93 
12.5 023 He 0.33 0 xtal B1 8-2-93 
r2.5 0.2 8. 0.33 0 ppt C6 8-2-93 
12295 OFZ T 0.33 0 xtal B5 8-2-93 
12.5 0.2 7 0.33 0 clear A4 8-2-93 
12.5 0.1 8.5 0.33 0 fine needles C4 8-2-93 
1245 041 7.5 0.33 0 fine needles B3 8-2-93 
12.5 a heist 7.0 0.33 0 fine needles A2 8-2-93 
12.5 0 TS 1 0 fine needles C4 6-15-93 
$275 0 725 1 0 clear C6 6-15-93 
10 0.2 7.5 f 0 clear D5 6-1-93 
10 0.2 745 1 0 clear D3 6-1-93 
10 0.2 7.5 1 0 clear B3 6-1-93 
10 rae 7.5 1 0 clear B1 6-1-93 
10 0 Toe 1 0 clear A2 6-15-93 
10 0 7.5 1 0 clear A4 6-15-93 
5 0.2 7.5 1 0 clear C4 6-1-93 
5 052 7.5 1 0 clear C2 6-1-93 
5 ocr 7.5 1 0 clear A2 6-1-93 
5 Ost 7.5 1 0 clear A4 6-1-93 


These crystallization experiments were carried out by the hanging drop method (see text). 
The headings given above indicate the final concentrations of the following materials in 
the drop at the end of the experiment: % PEG - the percent of PEG 8000; M NaCl - the 
molarity of NaCl; pH; Conc. E - the relative concentration of enzyme (for example, 0.33 
means that the final protein concentration was 1/3 that of the concentrated purified 
enzyme); Gly - the volume/volume percent of glycerol in the drop; Result - the result of 
the experiment (xtal means that a crystal with at least some three-dimensional shape was 
formed, separation refers to a phase separation, and ppt indicates the formation of 
precipitate). 


Collection and Analysis of X-ray Data 


When light is shone through a fine grating, a diffraction pattern is formed as the 
light rays emerge from the grating as wave fronts. Where the fronts meet they 
reinforce each other at some points and cancel each other at other points, thus forming a 
diffraction pattern. If a lens is properly placed in the path of the diffracted rays, an image 
of the original grating will be produced. There is an interesting relationship here in that 
an object and its diffraction pattern are Fourier transforms of each other. In other words 
the diffraction pattern of an object is the Fourier transform of the image of that object, 
while a Fourier transform of the diffraction pattern gives the original image. This is 
analogous to the diffraction of X-rays by electrons, with some important differences. 
First, there are no lenses which are able to focus X-rays, so that while the formation of an 
X-ray diffraction pattern is straightforward, the formation of an image from that pattern 
is not. Secondly, no instruments are sensitive enough to record diffraction from a single 
protein molecule. This second problem is overcome by using large single crystals of the 
enzyme. For a general reference on X-ray crystallography see Azaroff (1968). More 
specific information on protein crystallography may be found in Blundell and Johnson 
(1976) and McRee (1993). 

A crystal is a three-dimensional array of molecules, millions of molecules in the 
case of the large crystals used in X-ray diffraction studies. These crystals have electron 
density only at certain repeating points, where the atoms which form molecules are 
located. The crystal itself acts as a diffraction grating producing a pattern as a result of 
this regular spacing of molecules within the crystal. The diffraction pattern for a protein 
crystal is the product of the diffraction pattern for the molecule and that formed by the 
periodic array of molecules within the crystal. 


Molecules are able to crystallize in only a limited number of patterns. This is 
because a crystal is a solid object and the unit cells which form the crystal must be able to 
stack together like so many boxes of the same size without spaces between them. These 
patterns (space groups) are enumerated in the International Tables for Crystallography 
(1988). Because protein molecules have a unique handedness, only those space groups 
which do not require a mirror image are open to them. 

A unit cell is described by the lengths of three axes (a, b, c) and the angles 
between them (α, β, γ). There may be a single molecule within the unit cell or several 
molecules which are related by symmetry operations. Among these symmetry operations 
are a simple two fold rotation and a two fold screw rotation (both are illustrated in figure 
28). 


Two fold rotation: rotate object about an axis by 180°. 
Two fold screw rotation: rotate object about an axis by 180° and translate along that axis 
by 1/2 the length of the unit cell. 


Figure 28. Two fold and two fold screw rotations. 
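
As a minimal sketch of the two operations in figure 28, the following Python fragment 
applies a two fold rotation and a two fold screw rotation to a point given in fractional 
coordinates. The choice of the c axis as the rotation axis, the function names and the 
example point are assumptions made only for illustration. 

```python
def two_fold_rotation(x, y, z):
    """Rotate a point by 180 degrees about the c axis (fractional coordinates)."""
    # A 180 degree rotation about c maps (x, y, z) to (-x, -y, z);
    # the modulo keeps the result inside the unit cell.
    return ((-x) % 1.0, (-y) % 1.0, z % 1.0)


def two_fold_screw(x, y, z):
    """Rotate by 180 degrees about the c axis, then translate by 1/2 of c."""
    return ((-x) % 1.0, (-y) % 1.0, (z + 0.5) % 1.0)


point = (0.10, 0.20, 0.30)  # hypothetical atom position
print("rotation:", two_fold_rotation(*point))
print("screw:   ", two_fold_screw(*point))
# Applying the screw operation twice is a pure translation by one full cell
# length, so the point returns to an equivalent position.
print("screw x2:", two_fold_screw(*two_fold_screw(*point)))
```

Applying the screw rotation twice moves the point by one whole unit cell along the axis, 
which is why the operation is compatible with the repeating lattice. 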


In some crystals, symmetry operations are performed on a single protein molecule; in 
others, on a group of molecules. In the latter case, the molecules in this group may be 
related by a symmetry different from that of the larger crystal (non-crystallographic 
symmetry). 


This orderly arrangement of molecules within the crystal produces surfaces at regular 
intervals which reflect X-rays. Bragg's law is given by 

$$2d\sin\theta = n\lambda \tag{1}$$

where θ is the angle of the incident X-ray beam, λ is the wavelength of the X-rays and d 
is the distance between two reflecting planes. In the condition where n is any integer, 
there will be constructive interference and reflections will occur which can be recorded on 
photographic film (figure 29). 


Figure 29. Illustration of Bragg's Law. Incident X-rays are in phase. Diffracted X-rays will be in phase 
where the difference in their path lengths is equal to nλ (where n is any integer and λ is the wavelength of the 
X-rays). The path length difference here is equal to ABC or 2AB. AB equals d sin θ. Therefore there is 
constructive interference where nλ = 2d sin θ. 
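
Bragg's law can be turned into a short calculation. The sketch below is illustrative 
only: it assumes copper K-alpha radiation with a wavelength of 1.5418 Å, which is not 
stated in the text, and the function names are hypothetical. 

```python
import math

CU_K_ALPHA = 1.5418  # Angstroms; assumed wavelength, not given in the text


def bragg_spacing(theta_deg, wavelength=CU_K_ALPHA, n=1):
    """Return the plane spacing d (Angstroms) from 2 d sin(theta) = n lambda."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


def bragg_angle(d, wavelength=CU_K_ALPHA, n=1):
    """Return the Bragg angle theta (degrees) for planes separated by d Angstroms."""
    return math.degrees(math.asin(n * wavelength / (2.0 * d)))


# Angle at which planes 3.2 Angstroms apart (the diffraction limit quoted for
# the best Cpase IV crystal) would reflect, under the assumed wavelength.
theta = bragg_angle(3.2)
print(f"theta = {theta:.2f} degrees, 2 theta = {2 * theta:.2f} degrees")
print(f"d recovered from theta = {bragg_spacing(theta):.2f} Angstroms")
```

Smaller spacings (higher resolution) correspond to larger diffraction angles, which is 
why weakly diffracting crystals lose their outer reflections first. 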


The points of constructive interference so obtained will form a reciprocal lattice in which 
the distance (d*) between lattice points is proportional to the reciprocal of the distance (d) 
between the planes within the crystal. Taking into account the magnification which 
occurs because the film is located at a distance from the crystal and assuming that the 
crystal is orthorhombic (α, β, γ are right angles), the following equation can be used to 
determine the distance between the planes of reflection within a crystal from a precession 
image recorded on photographic film: 

$$d = \frac{F\lambda}{d_r} \tag{2}$$

where d is the distance between crystal planes, F is the crystal to film distance, λ is the 
wavelength of the X-rays and d_r is the distance between spots on the film. 
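
The relationship between spot spacing on a precession film and the spacing of planes 
in the crystal can be sketched as follows. The crystal to film distance of 100 mm is the 
value used for the photographs described later; the wavelength (copper K-alpha, 
1.5418 Å) and the spot spacings are assumptions chosen only to illustrate the arithmetic. 

```python
def plane_spacing(spot_spacing_mm, film_distance_mm=100.0, wavelength=1.5418):
    """Return d (Angstroms) = F * lambda / d_r for an undistorted precession image.

    spot_spacing_mm  -- distance between adjacent spots on the film (d_r)
    film_distance_mm -- crystal to film distance (F)
    wavelength       -- X-ray wavelength in Angstroms (assumed Cu K-alpha)
    """
    return film_distance_mm * wavelength / spot_spacing_mm


# Hypothetical spot spacings: closely spaced spots mean a long cell axis.
for spacing in (0.94, 1.10, 1.88):
    print(f"d_r = {spacing:.2f} mm  ->  d = {plane_spacing(spacing):.0f} Angstroms")
```

Because d and d_r are inversely related, a large unit cell gives closely spaced spots on 
the film. 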

The actual diffraction pattern for a crystal is the product of the transform 
of the molecule and the transform of the lattice, magnified by the distance between the 
crystal and the device used to record the diffraction image. In other words the diffraction 
pattern is the molecular transform sampled only at the points of the reciprocal lattice. 

The reflections which form a diffraction pattern are identified by three Miller 
indices (h, k, l). These indices indicate the relative location of the planes within the 
crystal which give rise to a particular reflection. For example, a plane which is parallel to 
both the b and c axes and which cuts the a axis at the length of the unit cell is called 
(1,0,0) (h = 1, k = 0, and l = 0), while a plane which is parallel to b and c but cuts the a 
axis at one half the unit cell is (2,0,0). A plane which cuts both a and b axes at 1/4 unit 
cell but is parallel to c is called (4,4,0) and so on. 

Diffracted X-rays, like other types of electromagnetic radiation, have a wave 
nature and can therefore be described by amplitude and frequency. A wave arriving at any 
point can be described by its amplitude and the phase of the wave at the time of arrival. 
F(hkl), known as the structure factor, represents a diffracted X-ray. It is a complex 
number which combines an amplitude, |F(hkl)|, with a phase, carried by a complex 
exponential of the form e^{-2πi(hx + ky + lz)}. X-rays diffracted from a crystal can be 
recorded on photographic film or electronically. The recorded spots are intensities which 
are related to the structure factor by the following: 

$$I(hkl) = F F^{*} \tag{3}$$

where F* is the complex conjugate of F. The darker the spot on a photographic film, the 
higher the intensity of the reflection. By taking the square root of the intensity, the 
amplitude |F(hkl)| can be obtained, but not the phase of the structure factors. One of the 
primary problems of X-ray diffraction is that of recovering the lost phase information. 
Several methods for doing this are discussed below. Once the phases of the structure 
factors have been determined, the production of an electron density map is fairly 
straightforward. An expression for the electron density of a crystal is given below 

$$\rho(xyz) = \frac{1}{V}\sum_{h}\sum_{k}\sum_{l} F(hkl)\, e^{-2\pi i(hx + ky + lz)} \tag{4}$$

where ρ(xyz) is the electron density at any point x,y,z, V is the volume of the unit cell, and 
h,k,l are the Miller indices as described above over which the series is summed. 

Conversely, it is possible to calculate the structure factor for a particular h,k,l once 
the locations of the atoms within a structure are known. In this case a Fourier transform of 
the summed electron density is taken as in equation (5) given below. 

$$F(h,k,l) = \sum_{x,y,z} \rho(x,y,z)\, e^{2\pi i(hx + ky + lz)} \tag{5}$$
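
The phase problem can be illustrated with a one-dimensional toy calculation. The 
following Python sketch is not part of the original analysis: the two "atoms", their 
weights and the range of h are hypothetical, and the unit cell length is taken as 1 so that 
the 1/V factor drops out. It computes structure factors from known positions, forms the 
intensities, and then compares the density synthesized with and without the phases. 

```python
import cmath
import math


def structure_factor(atoms, h):
    """One-dimensional analogue of the structure factor sum over atoms."""
    return sum(w * cmath.exp(2j * math.pi * h * x) for x, w in atoms)


def density(factors_by_h, x):
    """One-dimensional analogue of the Fourier synthesis (cell length = 1)."""
    total = sum(f * cmath.exp(-2j * math.pi * h * x) for h, f in factors_by_h.items())
    return total.real


atoms = [(0.20, 6.0), (0.55, 8.0)]  # hypothetical (fractional x, scattering weight)
hs = range(-10, 11)

F = {h: structure_factor(atoms, h) for h in hs}
intensity = {h: (F[h] * F[h].conjugate()).real for h in hs}  # I = F F*
amplitude = {h: math.sqrt(intensity[h]) for h in hs}         # phases are lost here

grid = [i / 200 for i in range(200)]
rho_true = [density(F, x) for x in grid]               # amplitudes and phases
rho_no_phase = [density(amplitude, x) for x in grid]   # amplitudes only

peak_true = grid[rho_true.index(max(rho_true))]
peak_no_phase = grid[rho_no_phase.index(max(rho_no_phase))]
print("highest density with phases:   x =", peak_true)
print("highest density without phases: x =", peak_no_phase)
```

With the phases included, the strongest peak falls on the heavier of the two model atoms. 
With amplitudes alone the synthesis piles up at the origin instead, which is why the 
measured intensities by themselves do not give an interpretable map. 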
Determination of Space Group 


Figures 30 and 31 show precession photographs of the Cpase I crystal shown in 
figure 23. Both precession photographs have been aligned to record only zero order 
reflections (reflections where at least one of the h, k or l indices is 0). The crystal to film 
distance was 100 mm and the precession angle was 8°. After these two precession 
photographs, this crystal had lost most of its ability to diffract X-rays, so that it was not 
possible to obtain upper level photographs. 


In figure 30 the incident beam of X-rays was oriented at an edge of the crystal 
rather than at a crystal face and the film was exposed for 10 hours. The rotating anode 
X-ray generator was operated at 280 milliamperes and 40 kilovolts. This image of the 
(0,k,l) plane shows mirror image symmetry both top to bottom and right to left. There is a 
systematic absence of every other reflection along the (0,k,0) axis which is not seen 
along the (0,0,l) axis. In addition the angle between the k and l axes is 90°. The length 
of the b axis was found to be 140 Å while the c axis was 164 Å (see equation 2). 

The crystal was rotated 90° around the c axis (horizontal) in order to obtain the 
photograph shown in figure 31, which shows the (h,0,l) plane. There is mirror image 
symmetry top to bottom and right to left and the h and l axes intersect at right angles. In 
this case a systematic absence of every other reflection occurs along the (h,0,0) axis. The 
a axis was found to be 82 Å (see equation 2). 

The symmetry seen in these precession photographs can be explained in two ways. 
It can be Laue class mmm (three mutually perpendicular mirror planes), which would 
indicate an orthorhombic crystal system. Or it could be Laue class 2/m (a two fold 
rotation with a mirror plane perpendicular to the axis of rotation), which would indicate a 
monoclinic crystal system. These two Laue classes will have the same appearance on zero 
order precession photographs, so that upper level precession photographs will be 
necessary to distinguish between the two of them. Thus this crystal may be orthorhombic 
or monoclinic in the special case where the unique angle (β) is 90°. 

The systematic absences on (h,0,0) and (0,k,0) indicate the presence of screw axes 
(see figure 28) parallel to the a and b axes, while the lack of these absences on (0,0,l) 
indicates a simple two fold axis parallel to the c axis. It would have been safer to assume 
the lower 2/m symmetry, but the data were treated as though the crystal was in the 
orthorhombic space group 18 (P21212) of the International Tables for Crystallography. 
The program Xengen (Howard et al., 1987), which was later used to process the data 
collected electronically, was able to index, that is, to assign h,k,l values to the reflections 
obtained, by assuming that the space group was P21212. 
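
The reasoning from systematic absences to axis type can be written as a simple check. 
This is only a sketch: the lists of observed axial reflections below are hypothetical 
stand-ins for what would be read off the precession photographs, and the function name 
is an assumption. 

```python
def consistent_with_screw_axis(observed_orders):
    """Return True if every observed axial reflection has an even index.

    Along a two fold screw axis the odd-order axial reflections are
    systematically absent, so only even orders should be observed.
    """
    return all(order % 2 == 0 for order in observed_orders)


# Hypothetical lists of axial reflections that were seen on the films.
observed = {
    "(h,0,0)": [2, 4, 6, 8],
    "(0,k,0)": [2, 4, 6, 8, 10],
    "(0,0,l)": [1, 2, 3, 4, 5],
}

for axis, orders in observed.items():
    if consistent_with_screw_axis(orders):
        kind = "two fold screw axis"
    else:
        kind = "simple two fold axis"
    print(axis, "->", kind)
```

Two screw axes and one simple two fold axis is the pattern expected for P21212, although, 
as noted above, zero level photographs alone cannot rule out the lower symmetry. 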
Figure 30. X-ray diffraction photograph taken with a precession camera. This is an image of the (0,k,l) 
plane. c* (reciprocal space c) is horizontal, while b* is vertical. The angle between them is 90°. See text 
for details. 
Figure 31. X-ray diffraction photograph taken with a precession camera. This is an image of the (h,0,l) 
plane. c* (reciprocal space c) is horizontal, while a* is vertical. The angle between them is 90°. See text 
for details. 

Because of the large volume of this unit cell, an asymmetric unit certainly contains 
more than one protein molecule. Based on other known protein structures one would 
expect a roughly globular protein with a molecular weight of 40,000 daltons to have a 
diameter of 50 to 75 Å. Matthews (1968) surveyed 116 globular proteins and found that 
the ratio of the volume of the asymmetric unit to the mass of protein within that unit (Vm) 
ranged between 1.68 Å³/dalton and 3.53 Å³/dalton for all proteins tested. For proteins 
with a molecular weight between 40,000 and 50,000 daltons, the range was narrower 
(about 1.8 Å³/dalton to 3.0 Å³/dalton). Since the volume of the Cpase asymmetric unit is 
4.71 x 10^5 Å³ and the molecular weight of the protein without the hydrophobic 
carboxy-terminus is 43,000 daltons, Vm values between 1.8 Å³/dalton and 3.0 Å³/dalton 
can be calculated by assuming that there are between 4 and 6 molecules in the asymmetric 
unit. While the results of this simple test are not conclusive, they certainly suggest that 
there is more than one molecule in the asymmetric unit, and more likely four or more. 
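
The Vm estimate can be reproduced with a few lines of arithmetic. The sketch below 
uses the cell axes measured from the precession photographs and the molecular weight 
given above; it assumes the orthorhombic space group P21212, which has four asymmetric 
units per unit cell. The variable and function names are illustrative only. 

```python
A, B, C = 82.0, 140.0, 164.0       # unit cell axes in Angstroms
ASYMMETRIC_UNITS_PER_CELL = 4      # general positions in P21212 (assumed space group)
MOLECULAR_WEIGHT = 43_000.0        # daltons, without the hydrophobic carboxy-terminus

cell_volume = A * B * C            # valid only because all cell angles are 90 degrees
asu_volume = cell_volume / ASYMMETRIC_UNITS_PER_CELL


def matthews_coefficient(n_molecules):
    """Return Vm in cubic Angstroms per dalton for n molecules per asymmetric unit."""
    return asu_volume / (n_molecules * MOLECULAR_WEIGHT)


print(f"asymmetric unit volume = {asu_volume:.3g} cubic Angstroms")
for n in range(1, 8):
    vm = matthews_coefficient(n)
    flag = "within 1.8 to 3.0" if 1.8 <= vm <= 3.0 else ""
    print(f"{n} molecules: Vm = {vm:.2f}  {flag}")
```

Only 4, 5 or 6 molecules per asymmetric unit give Vm values inside the 1.8 to 3.0 range 
(roughly 2.7, 2.2 and 1.8 Å³/dalton respectively). 
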
Collection of Data Using the Area Detector 


Once the space group of the Cpase crystals was known, diffraction data were 
collected for as many different h,k,l reflections as possible before the crystal stopped 
diffracting due to radiation damage. This was done by placing the crystal in the X-ray 
beam and then rotating the crystal by a small fraction of a degree every one to two 
minutes. Diffracted X-rays from each exposure (called a frame) were collected by a 
Nicolet (Siemens) X-200 three axis (ω, χ, φ) multi-wire area detector, which allows 
two-dimensional X-ray diffraction data to be recorded electronically rather than on 
photographic film. 


A crystal from Cpase IV, which grew in a vapor diffusion drop with a final 
concentration of approximately 12 mg/ml protein, 0.25 M NaCl, 15% PEG 8000 at pH 7.5 
in Tris buffer, was cleaved into three pieces. One of these pieces was used to collect a 
native data set, while another was used for a heavy atom data set. The piece of the 
crystal used for the heavy atom data was first soaked in a solution which contained 0.1 
mg/ml K2PtCl4 and 20% PEG 8000 in HEPES buffer at pH 7.5 (see section on 
isomorphous replacement). For both the native crystal and the heavy atom crystal 1,000 
frames were collected over the course of three days. The frames were separated from 
each other by a rotation of 0.15 degrees and were exposed for 180 seconds. The crystal to 
detector distance was 16 centimeters, while the 2θ angle for the detector was fixed at 
10.0 degrees. During data collection, the crystal was cooled to 4°C with a flow of cooled 
nitrogen gas. As crystals are exposed to X-radiation, small amounts of disorder occur 
within the crystal which cause a decrease in their ability to diffract. Cooling is often 
helpful in retarding this damage. 
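
The scale of each data collection follows directly from the frame parameters given 
above. The short sketch below simply multiplies them out; the variable names are 
illustrative, and any time spent between frames is not included. 

```python
FRAMES = 1000                   # frames collected per crystal
ROTATION_PER_FRAME_DEG = 0.15   # rotation between successive frames
EXPOSURE_PER_FRAME_S = 180      # exposure time per frame in seconds

total_rotation_deg = FRAMES * ROTATION_PER_FRAME_DEG
total_exposure_h = FRAMES * EXPOSURE_PER_FRAME_S / 3600

print(f"total rotation covered: {total_rotation_deg:.0f} degrees")
print(f"total exposure time:    {total_exposure_h:.0f} hours")
```

Each crystal was therefore rotated through about 150 degrees and spent about 50 hours in 
the beam, which is consistent with a collection period of roughly three days. 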

The collected data were processed on a VAX computer system using Xengen 
software (Howard et al., 1987). This software allows the user to refine the unit cell 
parameters (the lengths of a, b, c and the angles α, β, γ) as well as other parameters 
characteristic of the collection system such as the crystal to detector distance. The 
program then finds the center of the intensity of each reflection and allows the user to 
discard poorly behaved reflections, such as those which appear to be twinned or which 
occur at the edges of a frame. Once the location of all reflections has been established, 
they are assigned h, k, l indices using information on the space group provided by the user 
and the refined cell parameters. Some reflections with a particular h, k, l are observed 
more than once and the information from these multiple observations is merged. 
Because of radiation damage, diffraction at the end of data collection is not as 
intense as that at the beginning. In addition, certain aspects of the detector system may 
cause reflections at some locations on the area detector face to appear more intense than 
others. For these reasons the data are scaled so that the average intensity of the reflections 
remains the same for the entire collection period. Finally the user is able to obtain a set 
of structure factor amplitudes (F's) sorted by h, k, l as well as a set of statistics 
describing the data set. Table II gives the statistics for the native and heavy atom Cpase 
data sets. 


Table II Statistics on Collected Data 


Observations                            79,711 
Unique Reflections (Bijvoets merged)    26,156 
Completeness                            76% 
Resolution Range                        20-3.18 Å 
Average I/σI                            5.70 
R                                       10.07 


$$R = \sqrt{\frac{\sum_{ij}\left[(I_{ij} - w_{ij}\,I_{j,\mathrm{mean}})/\sigma_{ij}\right]^{2}}{\sum_{ij}\left(I_{ij}/\sigma_{ij}\right)^{2}}}$$

where 
i = index of observation 
j = index of reflection 
I = intensity 
σij = standard deviation of Iij 
w = scaling-function value that scales observation i of reflection j to other observations thereof 
Analysis of Data Collected with the Area Detector 


At this point there again arose a difficult problem common in protein 
crystallography, that is, a lack of crystals of sufficient quality to be useful. As was 
discussed elsewhere, each batch of Cpase from bacterial origins had different solubility 
properties so that a great deal of experimentation was necessary to find good 
crystallization conditions each time a new batch of enzyme became available. Cpase IV 
crystals which diffracted to 3.2 Å did not appear until late in my stay in Basel, 
Switzerland, so there was not sufficient time to repeat these conditions in Basel. Other 
Cpase IV crystals were X-rayed in Basel, but none diffracted to a resolution better than 
5 Å. Cpase IV crystals brought from Basel to Storrs were exposed to heat in excess of 
85°F, after which they did not diffract at all. Since no more crystals were available, an 
attempt was made to obtain a structure of Cpase using only the two sets of data collected 
in Basel. While it is theoretically possible to produce a structure from a single set of 
native data by molecular replacement or from a native data set and one set of heavy atom 
data using isomorphous replacement, the chances of success are not high unless the data 
are very good. In this case, the data were not ideal. There were two major weaknesses. 
First, the diffraction limit of the crystals is only about 3.2 Å. At 3.2 Å, the general 
features of molecular structure are discernible, but the details can be difficult to interpret. 
The second and more difficult problem was the presence of 4 or more molecules within 
the asymmetric unit. This means that the structure of these four molecules must be solved 
as a unit, converting the problem from the structure determination of a 43,000 dalton 
protein to one of at least 172,000 daltons. It is true that the presence of four molecules in 
the asymmetric unit means that once an electron density map of the asymmetric unit has 
been determined, the densities of these four molecules can be averaged so that the final 
result is a clearer map of a single molecule. This benefit, however, cannot be used until an 
initial solution to the phase problem has been found. There are other important problems 
associated with this condition which are discussed below. 
Molecular Replacement 


The first approach to solving the phase problem for Cpase was to use molecular 
replacement, which requires that there exist a structurally similar protein with a well- 
defined tertiary structure which is used as a model. The model protein is rotated and 
translated and then structure factors are calculated for it. The model structure factor 
amplitudes are then compared to the measured amplitudes of the unknown. This is 
repeated until one reaches the best correspondence between the model amplitudes and 
those of the unknown. This allows every amplitude for the unknown structure to be 
associated with a calculated phase from the known structure. A preliminary electron 
density map is then calculated using these data. 
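
The comparison step at the heart of this search can be sketched in a few lines. The text 
does not say which measure of agreement was used, so the conventional crystallographic 
R-factor is assumed here, and the amplitudes and trial labels are hypothetical values 
chosen only to show how the best trial would be selected. 

```python
def r_factor(observed, calculated):
    """Conventional R-factor between observed and calculated amplitudes.

    The calculated amplitudes are first placed on the scale of the observed
    ones with a single overall scale factor.
    """
    scale = sum(observed) / sum(calculated)
    mismatch = sum(abs(fo - scale * fc) for fo, fc in zip(observed, calculated))
    return mismatch / sum(observed)


def best_trial(observed, trials):
    """Return the label of the trial whose amplitudes agree best (lowest R)."""
    return min(trials, key=lambda label: r_factor(observed, trials[label]))


observed = [10.0, 20.0, 30.0]            # hypothetical measured amplitudes
trials = {                               # hypothetical model orientations
    "orientation A": [30.0, 20.0, 10.0],
    "orientation B": [12.0, 18.0, 30.0],
}

for label, calculated in trials.items():
    print(label, f"R = {r_factor(observed, calculated):.3f}")
print("best:", best_trial(observed, trials))
```

In a real search the trial amplitudes would come from structure factors calculated for 
each rotation and translation of the model, and many thousands of trials would be scored. 
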


Choice of a Model 


The first step in molecular replacement is to choose an appropriate model, that is 
one you expect to have a similar structure to your unknown. Cpase is a member of the 
penicillin interacting proteins, a large group of enzymes which function by the same 
mechanism and have similar active sites (see Introduction). A number of these proteins 
have been extensively studied, including a water soluble DD-carboxypeptidase- 
transpeptidase excreted by Streptomyces strain R-61 (called R-61) and the water soluble 
β-lactamase secreted by Bacillus licheniformis (called B.L.). The high resolution 
structures of both R-61 (Kelly et al., 1985, 1994) and B.L. (Moews et al., 1990, Knox & 
Moews, 1991) were determined at the University of Connecticut. While the amino acid 
sequences of Cpase, B.L. and R-61 show little homology (see figures 33 and 35), the 
structures of B.L. and R-61 show many features in common (see Introduction). Therefore 
it is reasonable to suppose that Cpase will have features in common with B.L. and R-61 
and that one or both of these enzymes might serve as an appropriate model for molecular 
replacement with Cpase. 

It is a well established fact that both the secondary and the tertiary structures of a 
protein are highly dependent upon the hydrogen bonding, hydrophobic, ionic and other 
characteristics of the residues that form a particular sequence. There have been numerous 
attempts to use sequence information to predict secondary structure. Thus far, these 
methods have met with some success; however, none of them predicts accurately in all 
cases. 

Kyte and Doolittle (1982) used average hydropathy to predict whether a section of 
protein is likely to be located on the surface or in the interior of a molecule. The method 
involves assigning each amino acid a hydropathy, averaging these hydropathies over an 
odd number of residues (for example 13) and then assigning that hydropathy to the 
central amino acid, in this case residue number 7. A plot can then be made of amino acid 
number vs. average hydropathy. Such plots were used to compare Cpase with both R-61 
and B.L. This was done by first tethering the active serines at position 36 in Cpase and 
position 62 in R-61 or position 52 in B.L. (note that this is the same residue as serine 70 
in the consensus numbering system for β-lactamases, see Ambler, 1980) and then 
superimposing the plot of the average hydropathy from Cpase on that for R-61 or B.L. 
(see figures 32 and 34). A limited number of breaks were introduced into the sequences in 
order to improve the fit. 
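
The averaging procedure described above can be sketched as follows. The hydropathy 
values are those of the Kyte and Doolittle (1982) scale; the function name and the short 
test sequence are hypothetical and are not taken from the Cpase, R-61 or B.L. sequences. 

```python
# Kyte and Doolittle (1982) hydropathy values, keyed by one-letter amino acid code.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy(sequence, window=13):
    """Return (residue number, average hydropathy) pairs for a sliding window.

    The average over each window is assigned to the central residue, so a
    window of 13 starting at residue 1 is reported at residue 7. Residue
    numbers are 1-based.
    """
    if window % 2 == 0:
        raise ValueError("window must contain an odd number of residues")
    half = window // 2
    profile = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        mean = sum(KYTE_DOOLITTLE[residue] for residue in segment) / window
        profile.append((start + half + 1, mean))
    return profile


# Hypothetical sequence: a hydrophobic stretch followed by a charged stretch.
example = "MLLAVIAGLLVFSAKDEKRTDEQKNS"
for number, value in average_hydropathy(example):
    print(number, round(value, 2))
```

Plotting the second column against the first gives a hydropathy profile of the kind 
superimposed in figures 32 and 34. 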


An examination of figure 32 shows a good correlation between the average 
hydropathy of Cpase and B.L. with the introduction of two breaks in the B.L. sequence; 
the first covers 15 residues starting at position 100 and the second covers 10 residues 
starting at position 185 of the B.L. sequence (see figure 33). It appears that 
approximately the first 280 of the 421 residues of Cpase correspond to the entire B.L. 
protein, which is only 271 residues in length. This led to the idea that Cpase may consist 
of two domains, one which has enzymatic activity and another which acts as an arm that 
would extend from the membrane to position the enzymatic domain. This idea is 
consistent with the fact that the 25 or so carboxy-terminal residues anchor Cpase to the 
cell membrane (Waxman & Strominger 1981). Since B.L. is a water-soluble enzyme, it 
would not be expected to have a membrane-anchoring domain. 

The Cpase sequence aligns with the 349 residue R-61 sequence in a similar
fashion, but not as well as against B.L. (see figure 34). In this case two breaks must also
be introduced into the Cpase sequence: the first covers 35 residues starting at position 93
and the second covers 12 residues starting at position 158 of the Cpase sequence (see
figure 35). Even though R-61 is 349 residues long, only about 280 residues from Cpase
can be fitted along its sequence. The remaining residues are again at the carboxy-terminus.
R-61 also does not contain a positioning domain as it is a water-soluble
enzyme.

It is perhaps somewhat surprising that Cpase should show a greater likeness to
B.L. than to R-61, since R-61 and Cpase are both carboxypeptidases, while B.L. is a
β-lactamase. Perhaps the source of the enzyme is more important in this case. Both the
B.L. and Cpase enzymes are produced by members of the genus Bacillus, while R-61 is
secreted by Streptomyces.

Figure 33. Comparison of the amino acid sequences of Cpase and BL. The Cpase sequence is given first and labeled "C". The BL sequence is below that for Cpase
and is labeled "BL". The alignment of these sequences follows directly from the plots of average hydropathy shown in figure 32.

Figure 34. Average Hydropathy R-61 vs C

2 3 4 5 6 7
ADLPAPDDTGL GAVLHTAL SOGAPGAMVRVDDNGTIHAL SEGVADRATGRAITTTDRFRVGSVTKSFSAVV R
ESAPL DIRADAAILVDAGTGRILYEKNIDTVLGIASMTKMMIEYL C

8 9 10 11 12 13 14
LLALVDEGKLDLDASVNTYLPGLLPDDRITVROAVMSHRSGLYDYTNDMFAQTVPGFESVRNKVFSYQDLIT R
LLDAIKAKRVKWDOMYTPSDYVYRLSADRALSNVPLRKDGKYTVRDVY----------------------- C

15 16 17 18 19 20 21
LSLKHGVTNAPGAAYSYSNTNFVVAGMLIEKLTGHSVATEYONRIFTPLNLTDTFYVHPDTVIPGTHANG R
EAMALYSANGATVAIAELIAGSEKNFVKMMNDKAKELGLKDYKFVNATGLSNKDLKGF C

Figure 35. Comparison of the amino acid sequences of Cpase and R-61. The R-61 sequence is given first and labeled "R". The Cpase sequence is below that for
R-61 and is labeled "C". The alignment of these sequences follows directly from the plots of average hydropathy shown in figure 34.

MERLOT


When using molecular replacement one tries to rotate and translate the model of a
known structure into the space of an unknown structure. In the case of Cpase, molecular
replacement was carried out using a package of computer programs called MERLOT (P.
M. D. Fitzgerald, 1988). The programs in this package are integrated so that output from
one program can easily be input to the next, but each program stands alone. The
evaluation of the results of individual programs is not trivial. The basic equation of
molecular replacement can be represented as follows:

$$X_2 = [C]X_1 + d \qquad (6)$$

where $X_1$ is the position of the model, $X_2$ is the position of the unknown molecule, [C]
represents the rotation matrix of the model molecule which will give it the same
orientation as the unknown molecule and d is the translation vector of the model which
will place it in the same position in the unit cell as the unknown molecule.
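
The action of the basic molecular replacement equation on a set of model coordinates can be sketched as follows. This is only an illustration under stated assumptions: the coordinates are taken to be Cartesian, the function names `rotation_about_z` and `place_model` are hypothetical, and a single rotation about z stands in for the general rotation matrix [C]. It is not a description of how MERLOT parameterizes rotations.

```python
import math


def rotation_about_z(angle_degrees):
    """Return a 3x3 matrix for a rotation about the z axis (illustrative [C])."""
    a = math.radians(angle_degrees)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


def place_model(coords, C, d):
    """Apply X2 = [C] X1 + d to every atomic position X1 in coords."""
    placed = []
    for x1 in coords:
        x2 = tuple(
            sum(C[i][j] * x1[j] for j in range(3)) + d[i] for i in range(3)
        )
        placed.append(x2)
    return placed
```

A rotation search varies [C] and a translation search varies d; each trial placement of the model is then scored against the observed data.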


Because the Cpase unit cell contains 4 asymmetric units which are related by symmetry
operations, it is necessary to solve equation 6 for only a single asymmetric unit or 1/4 of
the unit cell. However, equation 6 must be solved for each molecule within the
asymmetric unit. Since Cpase has at least 4 molecules in the asymmetric unit, there will
be at least 4 solutions. This means looking for 4 weaker peaks rather than a single strong
peak among the background noise.


Usually [C], the rotation matrix, can be found with more certainty than the
translation vector. Therefore, this part of the process is done first. Initially a large
number of rotations were indicated as possible orientations of B.L. with respect to Cpase.
An attempt was made to distinguish between them by using two different resolution
ranges, 7 Å - 4 Å and 6.5 Å - 3.5 Å. Among the many peaks which appeared in the
outputs, there were six reasonably strong ones which appeared at both resolutions. The
four strongest of these peaks were chosen as the most likely molecular rotations and
entered into the next group of programs, which generates maps similar to Patterson maps
(discussed below) that may be solved for translations in the x, y and z directions (see
Table III). The most likely translations for the top 3 rotations were selected.


Table III Rotations and Translations from MERLOT

0.420 0.316
0.387 0.463
0.244 0.500

The six rotations which appeared using both the 4 - 7 Å and the 3.5 - 6.5 Å resolution data
are given above. The rank columns give the rank order of the peak at a given
resolution, with the strongest peak labeled 1. Translations are given for the first three
rotations. See text.


Next these sets of translations and rotations were applied to the B.L. model. The
resulting molecular orientations were then placed within the Cpase unit cell and displayed
on a graphics terminal using a program PACK (Moews, P. and Moews, D. unpublished).
The packing of the molecules within the asymmetric unit seemed reasonable; that is, the
molecules did not interpenetrate each other and filled the space more or less evenly.

Finally these rotations and translations were refined using a MERLOT program
which minimized the residual, R, as described below.

$$R = \frac{\sum_{h} \left|\, |F_o(h)| - |F_c(h)| \,\right|}{\sum_{h} |F_o(h)|} \qquad (7)$$

where $F_o$ is the observed amplitude for Cpase at every h,k,l and $F_c$ is the calculated
amplitude for B.L. after applying the newly determined rotations and translations.
The program works by slightly varying the rotation angles or the translations and then
recalculating R. As the computed rotations and translations get closer to the actual
molecular location, the value of R decreases, and eventually reaches a minimum as the
calculated solution closely approaches the true location of the unknown molecule.
Unfortunately, no significant decreases in R were obtained. This is strong evidence that
the solutions found are not correct.
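
As a minimal sketch of the quantity being minimized, the function below computes a residual from matched lists of observed and calculated amplitudes. It assumes the conventional crystallographic definition, R = Σ| |Fo| - |Fc| | / Σ|Fo|, and assumes that the two lists are already on a common scale; the name `residual` is hypothetical and the scaling and weighting actually applied by MERLOT are not reproduced here.

```python
def residual(f_obs, f_calc):
    """Return R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over matched reflections.

    f_obs and f_calc hold amplitudes for the same reflections in the same
    order and are assumed to be on the same scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists differ in length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator
```

A refinement cycle would evaluate this function after each small change in the rotation angles or translations and keep the change only if R falls.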


Isomorphous Replacement 


The second approach to solving the phase problem for Cpase was to use
isomorphous replacement. Two data sets are said to be isomorphous if the crystals used to
obtain them are of the same protein, have the same space group and unit cell, and if the
differences in their diffraction patterns are relatively small. If one of these data sets
contains a heavy atom, while the other does not, it may be possible to use the difference
between these data sets to locate the heavy atom or atoms within the unit cell. The
location of the heavy atom can then be used to determine the phases for the native data.


Diffraction intensity (I) is equal to F multiplied by its complex conjugate. A
Fourier transform of diffracted intensities is called the Patterson function. Because the
transform of (A multiplied by B) is equal to the (transform of A) convoluted with the
(transform of B), the Patterson function is also equal to the (transform of the structure)
convoluted with the (transform of the inverse of the structure). In this case inverse means
that each point in the original structure is replaced by one that is the same distance
from the origin but oriented in exactly the opposite direction. Thus a Patterson map is the
convolution of the structure with the inverse of that structure or, in other words, a map
which shows not the atomic positions but the interatomic vectors between all atoms (see
figure 36). For a molecule containing only a few atoms, it is often possible to deduce the
structure of the molecule from a careful examination of a Patterson map. However,
because of the large number of atoms present, this is not possible for a protein molecule.
But if there are only a small number of heavy atoms bound to the protein molecule, the
locations of these heavy atoms can often be determined using a difference Patterson map.


Figure 36. A shows a structure, while B shows the Patterson vectors generated by that structure. The origin
on B contains the self vectors (AA, BB and CC) and will therefore be the strongest peak on the map.

appear at -2X, -2Y on the plane where W is equal to zero. In a similar fashion it is
possible to show that peaks corresponding to heavy atom vectors will appear at the
following locations on the following planes:

on U = 1/2
at 2Y - 1/2, 2Z;   -2Y - 1/2, 2Z;
   -2Y + 1/2, -2Z;   2Y + 1/2, -2Z

on V = 1/2
at 2X - 1/2, 2Z;   -2X - 1/2, 2Z;
   -2X + 1/2, -2Z;   2X + 1/2, -2Z

Peaks which represent heavy atom vectors should appear on all three planes at the
positions indicated and be consistent with each other.
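
The Harker-plane positions can be generated mechanically from the symmetry operators, which gives a convenient check on a trial heavy atom site. The sketch below is an illustration only: it assumes the general equivalent positions of P2₁2₁2 in its standard setting, the names `P21212_OPS` and `harker_vectors` are hypothetical, and coordinates should be passed as strings (or `Fraction` objects) so that the arithmetic is exact.

```python
from fractions import Fraction as Fr

HALF = Fr(1, 2)

# General equivalent positions of P2(1)2(1)2 (standard setting), each written
# as (signs, shifts) so that a mate is sign * coordinate + shift.
P21212_OPS = [
    ((1, 1, 1), (0, 0, 0)),          # x, y, z
    ((-1, -1, 1), (0, 0, 0)),        # -x, -y, z
    ((-1, 1, -1), (HALF, HALF, 0)),  # 1/2 - x, 1/2 + y, -z
    ((1, -1, -1), (HALF, HALF, 0)),  # 1/2 + x, 1/2 - y, -z
]


def harker_vectors(x, y, z):
    """Return the set of Patterson vectors (U, V, W) between the symmetry
    mates of a single site, each reduced into the range [0, 1)."""
    site = (Fr(x), Fr(y), Fr(z))
    mates = [
        tuple(s * c + t for s, c, t in zip(signs, site, shifts))
        for signs, shifts in P21212_OPS
    ]
    vectors = set()
    for i, a in enumerate(mates):
        for j, b in enumerate(mates):
            if i != j:
                vectors.add(tuple((p - q) % 1 for p, q in zip(a, b)))
    return vectors
```

For a site in a general position this yields twelve vectors, four on each of the planes W = 0, V = 1/2 and U = 1/2. A candidate site read from one plane can therefore be tested by asking whether the predicted peaks are present on the other two.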
Production of Difference Pattersons 


Difference Patterson maps were produced using a series of programs, XTAL
(Stewart & Hall, 1988). The BFOURR program scaled the native and heavy atom data
sets and then found the differences between them. This was followed by the FOURR
program which squared the delta Fs and then calculated a difference Patterson. Peaks
appearing on the Harker planes for space group P2₁2₁2 (W equal zero, V equal 1/2 and U
equal 1/2) were examined. There were no strong peaks, but there were several weak
ones. It is often difficult to distinguish between weak peaks and background noise. A
search was made for x, y, z locations which appeared on all three planes. Two such
locations were found at (0.023, 0.17, 0.14) and at (0.11, 0.083, 0.22), given as fractional
coordinates in the unit cell.

These possible locations for the platinum heavy atom (see the section on data collection
using the area detector) were refined along with temperature factors and other parameters
using the CRYLSQ program in XTAL. Although several attempts were made changing
the order in which various factors were refined, the residuals failed to drop and thus there
was no evidence that these solutions are valid.

Questions naturally arose at this point about whether or not the data were being 
handled correctly in XTAL. XTAL was at that time a fairly new system at the University 
of Connecticut, although it had been widely used elsewhere. XTAL provides automated 
data handling in which the user does not examine the data at various stages of the 
processing. X-ray System of Crystallographic Programs (Stewart et al., 1972) on the 
other hand had been used successfully many times at the University of Connecticut and 
allows the user to readily examine the data at each stage of processing. Therefore the data 
were reprocessed using X-ray System. Scale factors were recalculated, the data were 
examined at each step and a new difference Patterson was calculated. An examination of 
the planes W equal zero, V equal 1/2 and U equal 1/2 did not reveal any strong peaks.
As before, the positions of weakly indicated atoms could not be refined, indicating that 
they are not actual heavy atom locations. 

At least part of the problem in determining the phases for Cpase is the large
number of molecules in the asymmetric unit. If there are 4 molecules in the asymmetric
unit, then there are 16 molecules in the unit cell. If each molecule has one heavy atom,
there will be 16² or 256 peaks on a difference Patterson. Sixteen of these peaks will
be superimposed upon the origin as self vectors. The remaining 240 would be distributed
throughout the unit cell. Because of the symmetry of the Cpase space group only 1/4 of
the Patterson space needs to be examined. Sixty peaks would be distributed throughout
this asymmetric unit with 4 occurring on each of the three planes described above. It is
not uncommon for more than one heavy atom to bind per protein molecule. If two heavy
atoms bind to each protein molecule, the situation becomes really complex with 32² or
1024 peaks on a difference Patterson.

one or perhaps two platinum atoms bound per molecule of protein with high occupancy,
that is, with almost all protein molecules binding these heavy atoms in the same positions.
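
The peak counts discussed above follow from simple arithmetic, sketched here so that other assumptions about the number of molecules or binding sites can be tried. The function name `patterson_peak_counts` and its parameter names are hypothetical, and the division by the number of asymmetric units simply follows the statement in the text that only 1/4 of the Patterson space needs to be examined.

```python
def patterson_peak_counts(molecules_per_asu, asu_per_cell, sites_per_molecule=1):
    """Count heavy atom vectors expected in a difference Patterson.

    N heavy atoms in the unit cell give N * N vectors, of which N are self
    vectors superimposed on the origin.
    """
    atoms = molecules_per_asu * asu_per_cell * sites_per_molecule
    total = atoms * atoms
    off_origin = total - atoms
    return {
        "atoms_in_cell": atoms,
        "total_vectors": total,
        "origin_vectors": atoms,
        "off_origin_vectors": off_origin,
        # Share of the off-origin peaks falling in 1/asu_per_cell of the map.
        "vectors_per_fraction": off_origin // asu_per_cell,
    }
```

With 4 molecules per asymmetric unit, 4 asymmetric units per cell and one site per molecule this gives 16 atoms, 256 vectors, 240 off the origin and 60 in one quarter of the map; with two sites per molecule the total rises to 1024.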


Direct Methods 


Heavy atoms can sometimes be located by direct methods. If successful, the 
relative phases of the structure factors are determined. If the structure is 
centrosymmetric, then all of the phases must be either 0° or 180° and the situation is very
much simplified. Once the phases of the structure factors are determined, an electron 
density map is produced by taking the transform of the complex structure factors 
(containing both phase and amplitude information). 

SHELXS-86 (Sheldrick, 1990) is a system of computer programs for applying 
direct methods to the phase problem for X-ray diffraction data. While it is used primarily 
for small molecules, it can also be used to locate heavy atoms within protein molecules. 

In this case the direct methods are applied to delta Fs (the absolute value of $F_{PH} - F_P$).
SHELXS and other such programs use a statistical approach to solving the phase problem.
The basic idea is that phase information can be derived directly from structure factor
amplitudes (F). They make use of normalized structure factors (E). E values can be
derived from observed Fs by a process in which the sum of $(E^2 - 1)^2$ over all reflections is
minimized. Structure invariants such as the one described below are then examined:


$$\varphi(h) \approx \varphi(h') + \varphi(h - h')$$

or

$$\varphi(-h) + \varphi(h') + \varphi(h - h') = \Phi \approx 0 \qquad (8)$$

where $\varphi(h)$ is the phase of a particular reflection.
As the magnitudes of the reflections associated with these phases increase, the
probability that equation (8) is correct increases. This information is combined with any
restrictions on the phases which occur as a result of the space group and choice of origin.

Based on these fixed phases, others are estimated and an evaluation of the 
correctness of the estimated phases is performed. In practice, large numbers of phases are 
estimated, while only those with high figures of merit are retained. The process is 
iterative and continues until there is no longer any improvement in the correctness of the 
estimated phases. By this method, A. Kuzin of our laboratory located twenty possible 
heavy atom locations within the Cpase unit cell. 

Dr. Kuzin then used the program PHASES (Furey & Swaminathan, 1990) to
refine these heavy atom locations. A difference Fourier map was then generated, again
using the program PHASES, and additional heavy atom sites were located and refined. In
the end, 38 heavy atom sites were identified.


Calculation of Structure Factors 


Once heavy atom positions are located, the structure factors are calculated for the
heavy atom for each h,k,l (see equation 5). These structure factors ($F_H$) contain both
amplitude and phase information, while the structure factor amplitudes (F) derived from
intensities recorded by area detectors or on film do not contain phase information. Each
h,k,l now has three F's associated with it, an F for the native data ($F_P$), an F for the heavy
atom protein complex data set ($F_{PH}$) and an F for the heavy atom alone ($F_H$). The
following equation shows the relationship between these quantities:

$$F_P + F_H = F_{PH} \qquad (9)$$

where $F_P$ is the structure factor, a vector, for the native data containing both amplitude
and phase, and $F_{PH}$ is the structure factor for the heavy atom complex data also
containing both amplitude and phase information. A diagram such as that shown in figure
37 can be used to estimate protein phases for every h,k,l in the native data set using the
observed $F_P$ and $F_{PH}$ data along with the calculated $F_H$ data.


Figure 37. Vector Diagram for determining phase angles. Since both the phase and the amplitude are
known only for $F_H$, this is the only one of the three vectors for which both the magnitude and direction are
known. This vector is drawn first. Next circles with radii representing the magnitude of $F_{PH}$ and $F_P$ are
drawn with their origins at either end of the $F_H$ vector. Possible correct phase angles for $F_P$ occur when the
three vectors $F_{PH}$, $F_H$, $F_P$ form a closed triangle. As can be seen there are two possible phase solutions for
$F_P$. Using additional heavy atom data sets will resolve the problem.


As can be seen from this diagram, unless $F_P$ and $F_{PH}$ are collinear, that is they have the
same phase angle so that the circles will intersect only once, a single set of heavy atom
data will not determine a unique phase for any h,k,l. In practice, several sets of data using
different heavy atoms are usually collected to better determine the correct phase. There
are methods to enhance phases that were determined using only one heavy atom
derivative, such as solvent flattening, but these were not employed with Cpase.
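
The two-fold ambiguity of a single derivative can be made concrete with a short calculation. The sketch below solves the closed phase triangle algebraically rather than graphically, using the law of cosines: the phase difference between the native and heavy atom structure factors satisfies cos(Δ) = (|F_PH|² - |F_P|² - |F_H|²) / (2 |F_P| |F_H|). The function name `sir_phase_candidates` is hypothetical, measurement errors are ignored, and this is not the procedure implemented in PHASES.

```python
import cmath
import math


def sir_phase_candidates(fp_amp, fph_amp, fh):
    """Return the native phases (radians) consistent with one derivative.

    fp_amp and fph_amp are observed amplitudes; fh is the calculated heavy
    atom structure factor as a complex number. Two phases are returned in
    general, one if the vectors are collinear, none if the triangle cannot
    be closed with these amplitudes.
    """
    fh_amp = abs(fh)
    if fp_amp <= 0 or fh_amp == 0:
        raise ValueError("amplitudes must be non-zero")
    alpha_h = cmath.phase(fh)
    cos_delta = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2 * fp_amp * fh_amp)
    if abs(cos_delta) > 1 + 1e-9:
        return []
    delta = math.acos(max(-1.0, min(1.0, cos_delta)))
    if delta == 0.0 or delta == math.pi:
        return [alpha_h + delta]
    return [alpha_h + delta, alpha_h - delta]
```

A second derivative with a different heavy atom structure factor would give its own pair of candidates, and the phase common to both pairs would be the preferred solution.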


The phases for as many Cpase reflections as possible were calculated using the
isomorphous data and all 38 heavy atom sites as well as anomalous scattering (see Table
IV).

Table IV Phasing Power and Figure of Merit vs Mean D Spacing

Overall mean figure of merit for 11,807 reflections to 3.2 Å is 0.393

Columns: Mean D Spacing (Å); Isomorphous Replacement Data*, Phasing Power;
Anomalous Scattering Data*, Phasing Power; Combined Data, Figure of Merit

Mean D Spacing: 8.5, 8.1, 8.0, 6.5, 5.7, 5.2, 4.8, 4.5, 4.2, 3.9, 3.6
[Only the mean D spacing column is legible; the phasing power and figure of merit entries could not be recovered.]

* Ranges differ for isomorphous and anomalous data.

Phasing power = $f_H/E$, where E is the lack of closure of the phase triangle (see figure 37). Figure of merit,
which ranges between 0 and 1, gives a measure of the correctness of a phase.


Phases were obtained for 11,807 of the 21,852 non-zero native reflections. The overall
figure of merit for these phases was 0.393, which indicates a large ambiguity in many of
them. In order to calculate an electron density map these phases were imported into the
program X-PLOR, a system of programs for the refinement of crystallographic and NMR
data (Brunger, 1987; Brunger et al. 1990), and an electron density map was calculated.

Molecular Boundary Maps


The resulting electron density map was displayed using the program GRINCH
(Williams, 1982) on an Evans and Sutherland graphics terminal. The results of this effort
can be seen in figures 38 - 41. Because of the large size of the map, there were technical
difficulties with displaying an entire asymmetric unit. It was however possible to display
parts thereof. Figures 38, 39 and 40 represent 1/8 of the asymmetric unit, while 1/2 the
asymmetric unit is displayed in figure 41. The molecular boundaries are clearly visible.
Figures 38, 39 and 40 show different views of the same region of the map. The
orientations in figures 38 and 39 are related to each other by a rotation of 180° around a
vertical axis in the center of the image. The large mass of electron density in the lower
right-hand corner of figure 38 is the same mass as is seen in the lower left-hand corner
of figure 39. Figure 40 shows a rotation of about 45° toward the viewer around a
horizontal axis of figure 39. It shows more clearly the electron density in the upper
portions of figures 38 and 39.

Since figure 41 shows 1/2 of an asymmetric unit, it is much more thickly
populated than the others. A careful examination reveals three distinct areas of electron
density. This could indicate that there are 6 molecules in the asymmetric unit, but it is
impossible to be certain because while the boundaries of this electron density are clear in
some directions, they are not clear in others.


Taken as a whole these figures show a non-random distribution of electron
density, with clear indications of molecular boundaries. This is good evidence that the
phases determined are correct within a wide margin of error. This represents a good
starting point for phase enhancement.

Figure 38. Photograph of Molecular Boundary Map representing approximately 1/8 of an asymmetric unit.

Figure 39. Photograph of Molecular Boundary Map representing approximately 1/8 of an asymmetric unit.
This figure shows the same region of the map as seen in figure 38. They are related to each other by a
rotation of 180° around a vertical axis in the center of the image.

Figure 40. Photograph of Molecular Boundary Map representing approximately 1/8 of an asymmetric unit.
This is the same region of the map as shown in figures 38 and 39. This view is related to that in figure 39 by a
rotation of about 45° toward the viewer around a horizontal axis of figure 39.

Figure 41. Photograph of Molecular Boundary Map representing approximately 1/2 an asymmetric unit.
This map clearly shows distinct areas of electron density.


DISCUSSION 


Protein Purification 


A method was found for separating Cpase from other proteins in the Pichia
medium including a very similar protein of slightly higher molecular weight. FPLC was
found to be an effective method for the purification of Cpase from Pichia medium. After
the resuspension of the ammonium sulfate precipitate in Tris buffer, the Pichia medium
remained dark brown in color, with a strong, unpleasant odor and a sticky texture. In
addition it had a very high optical density at 280 nm, most of which was due to
non-protein material (see Methods and Results). Several types of chromatographic systems
were investigated before FPLC using a Mono-S column was determined to be the best
available system for this application. There were several advantages to this system. First,
the media components did not adhere to the Mono-S column. Some trouble in this regard
was encountered with the Zeta Prep cartridge and also with HPLC. The solvent systems
used with the Mono-S column are simple biological buffers with varying amounts of
NaCl or other common salts. They did not cause protein denaturation as did the solvent
system with the C-4 column on the HPLC. In addition, at pH 7.5 the only proteins
retained were the 43 KDa Cpase and the 43+ KDa and 30 KDa proteins, presumed to be
related to Cpase. It was possible to handle 5 - 25 mg of protein in a single run of the FPLC.
Because the column volumes are small and the system runs at a slightly elevated pressure,
column runs could be completed in 1 - 4 hours including setup time.

The SDS PAGE mini-gel proved to be an effective method for determining which
proteins were located in which fractions from the FPLC. The primary advantage of the
mini-gel is the fact that it can be set up and run in 3 - 4 hours, from gel pouring to the start
of staining. Using the gels to locate Cpase also allowed for detection and assessment of
other proteins present.
Crystallization 


Diffraction quality crystals of Cpase were obtained although a steady supply of
these crystals was never obtained. This is because the native enzyme used in the original
experiments was available in small batches which varied significantly from one to the next, so
that much experimentation was necessary in order to find crystallization conditions. Cpase
is very difficult to salt out of solution. Of the salts tested only ammonium sulfate could
force the water-soluble form of Cpase to precipitate. At least some Cpase preparations
seemed to increase in solubility with increasing NaCl concentrations. Cpase will come
out of solution in the presence of PEG. While PEG 400 and 4000 tended to cause only
precipitation, needle crystals were often observed forming in precipitate obtained with
PEG 8000. The use of PEG alone gave only small crystals. Much better crystals were
obtained when a combination of NaCl and PEG was used. In general final concentrations
of 12 - 15 % PEG 8000 with 0.2 to 0.4 M NaCl at a pH close to 7.5 seemed to work best
for crystallization. Early experiments with cation exchange chromatography showed two
overlapping peaks at pH 5.5, whereas there was only one peak at pH 7.5. This might
indicate the existence of two forms of the enzyme with slightly different charges at the
lower pH and might explain why crystals did not form at pHs below 6.8. None of the
additives tested increased the ability to obtain good crystals. Crystals which diffracted
well included those which took several weeks to form and those which appeared within
one week. In addition, there seems to be a broad range of effective protein concentrations.

It is very important to protect Cpase crystals from desiccation and from heat. No crystals
exposed to temperatures in excess of 85°F retained their ability to diffract X-rays. High
temperature tends to cause the formation of spurs on or the dissolution of growing
crystals. Crystals in a PEG and salt solution exposed to air have been observed to
dissolve, possibly because as water evaporates, the NaCl concentration increases, causing
an increase in the solubility of Cpase. The best holding solution for Cpase crystals
appears to be one with a higher PEG and lower NaCl concentration than the solution in
which they were grown.

Thus far only small crystals of the genetically engineered enzyme have been
grown. These have been too small to carry out X-ray diffraction experiments using the
area detector.
Molecular replacement 


R-61 and BL have very little sequence homology and yet their tertiary structures
are remarkably similar (see Introduction). Because Cpase belongs to the same family of
proteins as R-61 and BL, it seemed likely that the tertiary structure of Cpase would also
be similar to that of R-61 and BL. Because the atomic coordinates for both R-61 and BL
were available, the choice of a model for molecular replacement was made between these
two enzymes as explained in the Methods and Results section. This work resulted in two
predictions. The first is that Cpase consists of two domains, a catalytic domain comprising the
first 280 residues in the sequence and a second, positioning domain which contains the
remaining 120 residues. It is the hydropathy of the first domain which can be fitted to the
hydropathy of the R-61 or BL sequences. Second, since a higher degree of similarity was seen
between Cpase and BL than between Cpase and R-61, it may be that the three dimensional
structure of Cpase will be more similar to that of BL than to R-61. Evaluation of these
predictions must await the production of a model for Cpase.

Phasing and Map Interpretation


Using direct methods, phases were obtained and a low resolution map was
calculated. While phases were obtained for approximately half of the reflections in the
native data set, the figures of merit and the phasing power for these phases are relatively
low, as might be expected in the case where there is only one derivative available. These
phases were used to produce a map which shows molecular boundaries clearly in some
directions, but not as clearly in others. These phases and this map can be used as a starting
point for further refinement, perhaps using a technique such as solvent flattening.
Future work 


There are essentially three avenues by which higher quality electron density maps
of Cpase might be produced. The first is to continue work on the data already collected.
The second is to use a high energy X-ray source in order to obtain data from the small
crystals currently available. And the third is to find a way to reproducibly produce a
supply of good quality crystals. All three options can be pursued simultaneously.

While the current electron density map does show molecular boundaries clearly in
some directions, they are not as clear in others. If this map can be improved to the
point where each molecule within the asymmetric unit can be located then the number of
molecules within the asymmetric unit would be known. One could then average the
density to get a better density image of one molecule. If the molecular positions can be
located with more certainty, then it will be possible to find the rotational and translational
relationship between the molecules within the asymmetric unit. This might be very
helpful in selecting the correct rotational and translational solutions from among the many
presented in a molecular replacement program such as MERLOT. While this is
certainly possible, it will never produce a map that exceeds the limit
of resolution of the available data.

Since no experiments have been conducted using the small crystals currently
available with a high energy X-ray source such as that available at a synchrotron, it is
impossible to say how effective this approach would be. It is possible that usable data sets
could be obtained. It is also possible that the crystals would be rapidly disordered in the
high energy beam so that no usable data would be obtained.

The most promising approach is to continue the search for better quality crystals. 
In this regard work can continue with the current Cpase preparation using more exotic 
conditions, among which might be crystal growth in microgravity aboard the space 
shuttle, very long (several months) crystallization periods, and additives not yet tried. 

There is also another, perhaps more interesting approach to acquiring better
crystals, that is an attempt to crystallize the 30 KDa fragment which was observed during
the purification of the genetically engineered Cpase. Waxman and Strominger (1979)
mention the appearance of a 30 KDa fragment upon long exposure of Cpase to
α-chymotrypsin or trypsin. They further report that this fragment is able to bind penicillin.
Since Pichia is a fungus, it should not be surprising that it excretes proteolytic enzymes
into its environment. Such enzymes may be responsible for digesting 43 KDa Cpase to a
30 KDa fragment. As mentioned in Methods and Results, the ratio of the 30 to 43 KDa
protein increases with storage of the crude Pichia medium. Once other proteins are
removed from the medium by ion exchange chromatography, there is no further increase
in the 30 to 43 KDa ratio. This 30 KDa protein follows Cpase very closely through
cation exchange chromatography at different pHs even though the elution pattern for
Cpase is different at different pHs. During the development of the purification procedure
an attempt was made to separate the 43 KDa and the 30 KDa protein using a penicillin
affinity resin. The resin retained both the 30 KDa and 45 KDa proteins. Interestingly,
while a method for removing the 30 KDa protein from the 45 KDa preparations was not
found, a method (separation by hydrophobic properties using phenyl superose) was
discovered by which the 43+ KDa protein could be separated from the 43 KDa protein.
Sample A, which contained the 43+ KDa and 43 KDa and very little 30 KDa protein, was
used in the final large scale preparation, while sample B, which contained no 43+ KDa
protein but had considerable amounts of the 30 KDa protein, was not.

It is interesting to note that the molecular mass ratio of 43 KDa to 30 KDa is 1.43. 
If the predictions made in the Methods and Results section regarding the organization of 
the Cpase molecule are correct, then this molecule has two domains, a catalytic domain of 
about 280 amino acid residues and a positioning domain of about 120 residues in the 
genetically engineered enzyme. The ratio of residues in the total enzyme to those in the 
catalytic domain is then 400 to 280, or 1.43. If these two domains do in fact exist, then 
perhaps they are connected by a flexible bridge which is vulnerable to attack by proteases 
and which is not in a fixed position, lending variation to the conformation of Cpase. 
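
The agreement between the two ratios can be checked with a short calculation. The sketch 
below is illustrative only: it takes the larger value over the smaller in each case, uses 
the approximate masses (43 and 30 KDa) and residue counts (400 and 280) quoted in the 
text, and the variable names are our own. Because the masses are gel estimates, the close 
agreement should be read as suggestive rather than conclusive. 

```python
# Approximate values quoted in the text (gel estimates and predicted residue counts).
MASS_INTACT_KDA = 43.0      # intact engineered Cpase
MASS_FRAGMENT_KDA = 30.0    # fragment seen after proteolysis
RESIDUES_TOTAL = 400        # 280 catalytic + 120 positioning residues
RESIDUES_CATALYTIC = 280    # proposed catalytic domain

# Ratio of intact protein to fragment, by mass and by residue count.
mass_ratio = MASS_INTACT_KDA / MASS_FRAGMENT_KDA
residue_ratio = RESIDUES_TOTAL / RESIDUES_CATALYTIC

# Relative difference between the two ratios.
relative_difference = abs(mass_ratio - residue_ratio) / residue_ratio

print(f"mass ratio    = {mass_ratio:.2f}")
print(f"residue ratio = {residue_ratio:.2f}")
print(f"relative difference = {relative_difference:.1%}")
```

Both ratios round to 1.43, which is consistent with, but does not prove, the idea that 
the 30 KDa fragment corresponds to the proposed catalytic domain. 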

This would account for the consistent appearance of the 30 KDa protein when a 
Cpase sample is denatured and also might shed some light on the problems of obtaining 
good quality crystals of Cpase. The method for obtaining water soluble Cpase has been 
available since 1979 (Waxman and Strominger, 1979) and one hears reports of attempts 
made in various laboratories to crystallize Cpase; however, no structure has yet been 
published. This indicates the existence of some problem which is not overcome with 
standard methods. Researchers have long recognized a problem with possible 
heterogeneity arising from the enzymatic digestion of the carboxy-terminus; however, 
perhaps that is a minor problem compared to the existence of a flexible bridge between 
the proposed catalytic domain and the positioning domain which introduces enough 
conformational variation into these molecules to make the formation of well-ordered 
crystals unlikely. 

Kwong et al. (1990) reported an interesting difficulty with the crystallization of a 
recombinant form of CD4, a transmembrane glycoprotein located on the surface of helper 
T cells. This recombinant form contained only the four extracellular domains of the 
protein. They obtained five different crystal forms belonging to three different space 
groups, none of which diffracted to better than 5 Å. There was evidence indicating that 
there might be hinges between some of these domains. When a fragment of this protein 
containing only the first two domains and not a supposed hinge region was crystallized, 
the crystal diffracted to 2.2 Å. 

It should not be difficult to produce the 30 KDa protein from the 43 KDa as this 
would only require allowing the crude preparation to remain unfrozen for a period of time 
to permit digestion to proceed. However, there is another potentially difficult problem. 
The fact that the 30 KDa protein follows the 43 KDa protein so closely suggests that the 
protease only nicks the Cpase molecule, which then remains intact until it is 
denatured by SDS. Separating the 30 KDa fragment from the rest of the protein will 
probably require at least partial denaturing and then renaturing of the 30 KDa Cpase 
fragment. The use of non-denaturing PAGE, or the measurement of molecular mass by 
molecular sieve chromatography, should determine whether or not the 30 KDa 
protein remains associated with the other fragment. 

Should the attempt to crystallize the 30 KDa protein prove successful and should 
good electron density maps be obtained from those crystals, then perhaps it might be 
possible to use the atomic coordinates for that structure to find the phases for the 43 KDa 
Cpase by molecular replacement. Since this fragment would represent only about 70 % 
of the mass of the intact molecule there is no guarantee of success. However, if the 
predictions made in the Methods and Results section are correct, the fragment would be a 
better model than either R-61 or BL. It would then be possible to see if Cpase is a two 
domain protein with a hinge between the domains. 